A human motion following controller is provided by the
invention to augment motion of items (e.g., computer cursor or
scene view) shown on a computer display. The display is coupled
to the computer, which controls positioning of the items through
operating system controls. A camera captures frames of data
corresponding to a first image of at least part of a user (e.g., eyes,
hands) at the computer display. Signal processing electronics
coupled to the camera (a) detect differences between successive
frames of data corresponding to motion of the first image, and
(b) communicate difference information to the computer to
reposition display of the items through the operating system
controls. The items are thus repositioned on the display by an
amount corresponding to the motion of the first image.

HUMAN MOTION FOLLOWING COMPUTER MOUSE AND GAME
CONTROLLER



A portion of the disclosure of this patent document contains material which is
subject to copyright protection. The copyright owner has no objection to the
facsimile reproduction by anyone of the patent document or the patent disclosure, as
it appears in the Patent and Trademark Office patent file or records, but otherwise
reserves all copyright rights whatsoever.

Related Applications 

This application is a continuing application of commonly-owned and co-pending
U.S. provisional application number 60/070,512, filed on January 6, 1998,
and U.S. provisional application number 60/100,046, filed on September 11, 1998,
each of which is incorporated herein by reference.

Background

The primary human interfaces to today's computers are the keyboard, to enter
textual information, and the mouse, to provide control over graphical information.
These interfaces help users with word processing, presentation software,
computer-aided design packages, spreadsheet analyses, and other applications. These
interfaces are also widely used for computer gaming entertainment, though they are
often augmented or replaced by a joystick.

In daily use of business software applications, control of cursor position on
the screen requires that the user remove his/her hand from the keyboard in order to
use the standard mechanical mouse. The use of the mouse introduces several issues.
In a desk environment, the mouse requires that space be kept clear on the desk.
The mouse cord must also remain free from obstruction to facilitate movement.
Additionally, the use of the mouse is a major contributing factor to carpal-tunnel
syndrome. It would be advantageous therefore to find an alternative to the
mechanical mouse.

In computer gaming, game complexity generally requires control of the (i)
mouse and keyboard, or (ii) joystick and keyboard. Further, gaming applications
usually require control in several axes of motion, including forward motion, reverse
motion, left turn, right turn, left strafe (slide), right strafe, upward motion, and
downward motion. To further complicate game maneuvers and control, many
games permit viewing (within the game environment) in directions different from
that in which the vehicle (e.g., the car, or person, simulated within the game) is
moving, including up, down, left and right. These many complexities of motion in
fact increase or modify the complexity and enjoyment of the game.

Nevertheless, these complexities require that the user have utmost dexterity
and control of his/her body. One object of the invention, therefore, is to offer
alternative approaches to human-computer interfaces for those incapable of using
standard devices (e.g., mouse, keyboard and joystick), for example due to disability.

Another object of the invention is to provide an alternative input device for
laptop computers. Laptop computers are often used in locations which do not allow
the use of a mouse, such as in airplanes or in business meetings where there is no room
to operate the mouse. Through the use of either a clip-on camera or a camera built
into the laptop display, the laptop user can control the mouse position or use the
camera for teleconferencing while on the road.

Other objects of the invention are to replace or augment existing human 
computer interfaces to facilitate enhanced gaming and/or control within game 
environments. 

In the prior art, certain systems exist which attempt to reduce the amount of
physical interaction required with game controllers. However, such systems are
prohibitively expensive for the general public as their costs are driven by techniques
and algorithms which detect user head motion based upon a detectable target worn
by the user. Other costly and cumbersome systems require the user to wear
apparatus which emits or detects a signal. It is thus one other object of this invention
to provide a system which detects user motion without the aid or augmentation of
artificial devices placed on the user operator.

Another object of the invention is to provide a means of human control of a
graphical computer interface through the physical motion of the user in order to
control the activity of a cursor in the manner usually accomplished with a computer
mouse.

A further object of the invention is to provide additional degrees of freedom
in the human computer interface in support of computer games and entertainment
software.

Yet another object of the invention is to provide dual use of teleconferencing
and video electronics with gaming and computer control systems.

These and other objects will be apparent in the description which follows.

Summary of Invention 

As used herein, "cursor" means a computer cursor associated with a
computer screen. "Scene view" means the view presented on a computer display to
a user. For example, one scene view corresponds to the scene presented to a user
during a computer game at any given moment in time. The game might include
displaying a scene whereby the user appears to be walking in a forest, and through
trees. In another example, a cursor might also be visible in the scene view as a
mechanism for the user to select certain events or items on the scene (e.g., to open a
door in a game, or to open a folder to access computer files).

As used herein, "camera" refers to a solid state instrument used in imaging.
Typically, the camera also includes optical elements which refract light to form an
image on the camera's detector elements (typically CCD or CMOS). For example,
one camera of the invention derives from a video-conferencing camera used in
conjunction with Internet communication.

In one aspect, the invention provides systems and methods to control
computer cursor position (or, for example, the scene view or game position as
displayed on the computer display) by motion of the user at the computer. A
camera rests on or near the computer, or is built into the computer, and connects
therewith to collect "frames" of data corresponding to images of the user. These
images provide information about user motion, over time. Software within the
computer assesses these frames and algorithmically adjusts cursor motion (or scene
view, or mouse button, or some other operation of the computer) based upon this
motion. The motion may be imparted by up-down or left-right motion of the user's
head, by the user's hands, or by other motions presented to the video camera (such
as discussed herein). In one aspect, a close up view of the user's facial features is
used to impart a translation in the cursor (or scene view) even though the features
in fact rotate with the user's head. In yet another aspect, the rotation is used to
generate a corresponding rotation in computer game scene imagery.

In one aspect, the invention also provides a human factors approach to cursor
movement in which the user's rate of motion determines the relative motion of the
cursor (or scene view). By way of example, the faster the user's head travels over a
set distance, the further the corresponding cursor movement over the same time
period.
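
The rate-dependent mapping described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the algorithm of the invention: the function name `cursor_delta`, the linear gain law, and the constants `base_gain` and `accel` are hypothetical and chosen for the example.

```python
def cursor_delta(head_shift_px, frame_dt_s, base_gain=4.0, accel=0.05):
    """Map a head displacement (camera pixels) seen over one frame interval
    to a cursor displacement (screen pixels).

    Assumption of this sketch: the gain grows linearly with head speed, so
    the same head travel moves the cursor further when performed faster.
    """
    if frame_dt_s <= 0:
        raise ValueError("frame_dt_s must be positive")
    speed = abs(head_shift_px) / frame_dt_s   # camera pixels per second
    gain = base_gain + accel * speed          # hypothetical linear gain law
    return head_shift_px * gain


# Same 10-pixel head travel: once in a single frame, once spread over ten frames.
fast = cursor_delta(10, 1 / 30)
slow = sum(cursor_delta(1, 1 / 30) for _ in range(10))
```

With these assumed constants, the fast movement yields a larger total cursor displacement than the slow one, even though the head covers the same distance.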

In other aspects of the invention, the camera is either (a) a visible light camera
utilizing ambient lighting conditions or (b) a camera sensitive in another band such
as the near infrared ("IR"), the IR, or the ultraviolet ("UV") spectrum. In the latter
case (b), the illumination preferably emanates from a source such as an IR lamp
which is beyond human sensory perception. The sensor is typically mounted facing
the user so as to capture a picture of the user's face in the associated electromagnetic
spectrum. The lamp is typically integrated with the camera housing so as to facilitate
production and ease of consumer set-up.

In one aspect, a system of the invention provides an IR camera (i.e., a camera
which images infrared radiation) to image the user's face and to gauge the user's
stress level associated with a game on the computer. As the user's intensity increases
(such as in a fast moving computer game using a joystick or the methods discussed
herein), the system detects increased heat intensity on the user's face, forehead or
other body part by the imagery of the IR camera. This information is fed back into
the game processor to provide further enhancement to the game. In this manner, the
system gauges the user's reaction to the game and modifies game speed or operation
in a meaningful way. For example, suppose such a system determined that a
particular user was bored with the present game speed (a determination of boredom can
be made by assessing low IR output over large portions of the user's face). The
computer processor and game software can then cooperate to increase the gaming
speed and thereby increase this particular user's stress. Games of the invention are
thus made and sold to users with varying intelligence, age and/or computer
familiarity; and yet the system always "pushes the envelope" for any given user so
as to make the game as interesting as possible, automatically.
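
The feedback loop just described might be sketched as follows. The function name `adjust_game_speed`, the normalized 0..1 IR scale, the thresholds, and the step size are all assumptions made for illustration. The text above only states that low IR output leads to a speed increase; the slow-down branch for high IR output is an added assumption of this sketch.

```python
def adjust_game_speed(speed, face_ir_levels, low=0.3, high=0.7, step=0.1,
                      min_speed=0.5, max_speed=3.0):
    """Return an updated game speed from IR intensities sampled over the face.

    face_ir_levels: normalized IR readings in 0..1 (hypothetical scale).
    """
    if not face_ir_levels:
        return speed                      # no measurement: leave speed alone
    mean_ir = sum(face_ir_levels) / len(face_ir_levels)
    if mean_ir < low:                     # low thermal output: treat as boredom
        speed += step
    elif mean_ir > high:                  # assumed: ease off when output is high
        speed -= step
    return max(min_speed, min(max_speed, speed))
```

Clamping to `min_speed` and `max_speed` keeps repeated adjustments from running the game speed outside a playable range.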

In accord with one aspect of the invention, images captured by the sensor are
processed by a digital signal processor ("DSP") located either (a) in a PC card within
the host computer or (b) in a housing integrated with the sensor. In case (a), sensor
frames are sent to the PC card; and detected user motion (sometimes denoted herein
as "difference information") is communicated to the user's operating system via a
PCI (or USB or later standard) bus interface. These difference information
commands are interpreted by a low overhead program resident at the user's main
processor, which either updates the cursor position on the screen or provides motion
information to the user's computer game (e.g., so as to change the scene view). In
case (b), the DSP is contained within the camera housing; and frames are processed
local to the camera to determine difference information. This information is then
transmitted to the computer by a cable that connects to a bus port of the computer so
that the host processor can make appropriate movements of the cursor or scene
view. In another aspect, the DSP is mounted in the camera housing such that the
camera/signal processing subsystem produces signals which emulate the mouse via
the mouse input connector.

In an alternative configuration, frames of image data are sent directly to the
host computer through the computer bus; and that image data is manipulated by the
computer processor directly. With increasing computer processing speed, it is
expected that sensor data frames can be sent directly to the host processor for all
processing needs, in which case the PC card and/or separate DSP are not required.
Although this is possible today, the update rates are likely too slow for practicality.
Once GHz processors are on the market, a separate DSP may no longer be needed.

In one aspect of the invention, pixel format or pixel density of the camera
drives the accuracy of the system. Higher pixel density in the image of the user's
face, for example, increases the attainable resolution and cursor control (or the
attainable control of scene view motion). Camera formats of 240 vertical by 320
horizontal generally provide satisfactory performance. The number of pixels that
may be utilized is determined by system cost factors. Greater numbers of pixels
require more powerful DSPs (and thus more costly DSPs) in order to process the
image sequences in real time. Current technology limits the processing density to a
64x64 window for consumer electronics. As prices are reduced, and power
increases, the densities can increase to 128x128, 256x256 and so on. While 64x64
density is satisfactory for general household users, a higher fidelity system using a
greater number of pixels is possible, in accord with the invention, for higher end
applications at a proportionally higher cost. Non-square pixel formats are also
possible in accord with the invention, including a 64x128 detector array size.

In one aspect the data transfer rate from the camera is 30 frames/second at
240x320 pixels per frame. Assuming eight bits per pixel, the digital data transfer rate
is therefore 18.432 megabits/second. This is a fairly high transfer rate for consumer
products using current technology. While the data transfer can be either analog or
digital, the preferred method of image data transfer for this aspect is via a standard
RS170 analog video interface.
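
The transfer-rate figure above follows directly from the frame format. The short calculation below reproduces it; the helper name `video_bit_rate` is hypothetical, and the second call (a 64x64 window also sampled at 30 frames/second and eight bits per pixel) is an assumption added only for comparison.

```python
def video_bit_rate(rows, cols, bits_per_pixel, frames_per_second):
    """Raw (uncompressed) video data rate in bits per second."""
    return rows * cols * bits_per_pixel * frames_per_second


full_frame = video_bit_rate(240, 320, 8, 30)
print(full_frame / 1e6)   # prints 18.432 (megabits/second)

# Assumed comparison case: only a 64x64 processing window at the same rate.
window = video_bit_rate(64, 64, 8, 30)
print(window / 1e6)       # prints 0.98304 (megabits/second)
```

Under that assumption, restricting processing to a 64x64 window reduces the raw data rate by a factor of about 19.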

In accord with one aspect, a system of the invention defines two imaging
zones (either within a single camera CCD or within multiple CCD cameras housed
within a single housing). One imaging zone covers the user's head; and the other
covers the user's eyes. This aspect includes processing means to process both zones
whereby movement of the user's head provides one mechanism to control cursor
movement (or scene view motion), and whereby the user's eyes provide another
mechanism to control the movement. In essence, this aspect increases the degrees of
freedom in the control decision making of the system. By way of example, a user
might look left or right within a game without moving his head; but by assessing
movement of the user's eyes (or the pupils of those eyes), the scene view can be
made to rotate or translate in the manner desired by the user. Further, a user might
move his head for other reasons, and yet not move his eyes from a generally
forward looking position; and this aspect can assess both movements (head and
eyes) to select the most appropriate movement of the cursor or scene view, if any.

In another aspect, a system of the invention utilizes a camera with zoom
optics to define the user's pupil and to make cursor or scene views move according
to the pupil. In another aspect, the system incorporates a neural net to "learn" about
a user's eye movements so that more accurate movements are made, over time, in
response to the user's eye movement.

In still another aspect, a neural net is used to learn about other movements of
the user to better specify cursor or scene view movement over time.

In yet another aspect of the invention, a system is provided with two CCD
arrays (either within a single camera body or within two cameras). The arrays
connect with the user's computer by the techniques discussed herein. One CCD
array is used to image the user's head; and the other is used to image the user's
body. Motion of the user is then evaluated for both head and body movement; and
cursor or scene view movement is adjusted based upon both inputs.

In another aspect of the invention, a single CCD is used to image the user.
However, alternate frames are zoomed, electronically, so that one frame views the
user's head, and the next frame views the user's eyes. With the algorithm discussed
herein, these separate frame sequences (one for the eyes, one for the head) are
processed separately and evaluated, together, to make the most appropriate cursor
or scene view movement. If, for example, the system clocks at 30Hz, then one set of
frame sequences operates at 15Hz, and the other at 15Hz. However, the advantage is
that two movement information sets can be evaluated to invoke an appropriate
movement in the cursor or scene view.
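
Splitting such an interleaved capture into its two sequences can be sketched as below. The function name `split_alternating` and the convention that even-indexed frames view the head and odd-indexed frames view the eyes are assumptions of this example; the text does not specify the ordering.

```python
def split_alternating(frames, frame_rate_hz=30.0):
    """Split an interleaved capture into a head sequence and an eye sequence.

    Assumed ordering: frames 0, 2, 4, ... view the head and
    frames 1, 3, 5, ... view the eyes.
    Returns (head_frames, eye_frames, per_sequence_rate_hz).
    """
    head_frames = frames[0::2]
    eye_frames = frames[1::2]
    return head_frames, eye_frames, frame_rate_hz / 2


# Frames are represented here by their capture index only.
head, eyes, rate = split_alternating(list(range(6)), frame_rate_hz=30.0)
```

Each sequence can then be passed to its own motion-detection step, at half the sensor frame rate.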

Those skilled in the art should appreciate that different frame rates can be
used; and the two sequences (head and eyes) can also run at rates different from
each other. Further, the separate frame sequences can utilize other body parts, e.g.,
the head and the hand, to have two movement evaluations. Alternatively, a separate
camera (or CCD array) can be used to image other body parts, for example one camera
for the head and one for the hand.

The invention also provides methods for shifting cursor or scene views in
response to user movement. In one aspect, the scene view shifts left or right when
the user shifts left or right. In another aspect, the scene view rotates when the
user's head rotates. This last aspect can be modified so that such rotation occurs so
long as the eyes do not also rotate (in this situation, the user's head rotates,
indicating that she wishes the scene view to rotate; but the eyes do not, indicating
that she still watches the game in play). In another aspect, the scene view rotates in
response to the user's hand rotation (i.e., a camera or at least a CCD array of the
system is arranged to view the player's hand).

In another aspect, the invention provides a multi-zone player gaming system
whereby the user of a particular computer game can select which zone operates to
move the cursor or the scene view. By way of example, the system can include one
zone corresponding to a view of the user's head, where frames of data are captured
by the system by a camera. Another zone optionally corresponds to the user's hand.
Another zone optionally corresponds to the user's eyes. Each zone is covered by a
camera, or by a CCD array coupled within the same housing, or by optical zoom
zones within a single CCD, or by separate optical elements that image different
portions of the CCD array. By way of example, two zones can be covered with a
single CCD array (i.e., a camera) when the zones are the user's head and eyes. The
camera images the head, for one zone, and images the eyes in another zone, since the
zones are optically aligned (or nearly so). However, two cameras (or optionally two
CCD arrays with separate optics) can view two zones such as the user's head and the
user's hand. Combinations of zones are also possible and envisioned in accord with
the invention.

Zones in a single camera can also be identified by the computer by
prompting the user for motion from corresponding body parts. For instance, the
computer identifies the head zone by prompting the user to move his head. Then
the computer identifies the foot zone by having the user move his foot. Once the
zones are identified, the motion of each of these individual zones is tracked by the
computer, and the regions of interest in the camera image related to the zones are moved
as the targets in the zones move with respect to the camera.
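
One way to picture the prompted zone identification is to difference a frame taken before the prompt against one taken while the user moves the requested body part, and to take the bounding box of the changed pixels as the zone. The sketch below does this on small grayscale frames stored as lists of rows; the function name `motion_zone` and the fixed `threshold` are assumptions, and a practical system would likely accumulate several frames.

```python
def motion_zone(frame_a, frame_b, threshold=20):
    """Bounding box (top, left, bottom, right) of the pixels that changed
    between two equally sized grayscale frames, or None if nothing moved."""
    changed = [(r, c)
               for r, (row_a, row_b) in enumerate(zip(frame_a, frame_b))
               for c, (a, b) in enumerate(zip(row_a, row_b))
               if abs(a - b) > threshold]
    if not changed:
        return None
    rows = [r for r, _ in changed]
    cols = [c for _, c in changed]
    return min(rows), min(cols), max(rows), max(cols)


# Hypothetical 4x4 frames: the prompted body part brightens two pixels.
still = [[0] * 4 for _ in range(4)]
moved = [row[:] for row in still]
moved[1][2] = 100
moved[2][3] = 100
head_zone = motion_zone(still, moved)
```

Here `head_zone` is `(1, 2, 2, 3)`, the smallest rectangle enclosing both changed pixels.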

In one aspect, the invention provides a system, including a camera and edge
detection processing subsystem, which isolates edges of the user's body, for
example, the side of the head. These edges are used to move the cursor or scene
view. For example, if the left edge of the head is imaged onto column X of one frame of
the CCD within the camera, and yet the edge falls in column Y in the next frame,
then a corresponding movement of the cursor or scene view is commanded by the
system. For example, movement of the edge from one column to the next might
correspond to ten screen pixels, or other magnification. In one aspect, this
magnification is selected by the user. Up and down motion can also be detected by
similar edge detection. For example, by imaging the user's chin, an edge movement
in the up or down dimension is formed (e.g., if the bottom edge of the chin moves
from one row to the next, in adjacent frames, then a corresponding movement of the
cursor or scene view is made - magnification again preferably set manually with a
default starting magnification). Other images can also serve to define edges. For
example, in one aspect, a user's eyelash can be used to move the cursor (or scene
view) up or downwards; though typically the eye blink is used to reset the cursor
command cycle.
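
The column-to-column edge tracking described above can be illustrated on a single image row. The helper names `left_edge_column` and `cursor_shift`, the brightness-jump edge test, and the `threshold` value are assumptions of this sketch; only the magnification of ten screen pixels per column is taken from the example in the text.

```python
def left_edge_column(row, threshold=50):
    """Index of the first column whose brightness differs from its left
    neighbour by more than threshold, or None if no edge is found."""
    for col in range(1, len(row)):
        if abs(row[col] - row[col - 1]) > threshold:
            return col
    return None


def cursor_shift(prev_row, next_row, magnification=10, threshold=50):
    """Screen-pixel cursor shift implied by edge motion between two frames."""
    x = left_edge_column(prev_row, threshold)   # column X in the first frame
    y = left_edge_column(next_row, threshold)   # column Y in the next frame
    if x is None or y is None:
        return 0                                # edge lost: command no motion
    return (y - x) * magnification


# The head edge moves one column to the right between frames.
frame1_row = [0, 0, 0, 200, 200, 200]
frame2_row = [0, 0, 0, 0, 200, 200]
shift = cursor_shift(frame1_row, frame2_row)
```

With the default magnification, the one-column edge movement yields a `shift` of 10 screen pixels. Applying the same functions to an image column rather than a row gives the up-down case.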

In one aspect, an optical matched filter is used to center image zones onto the
appropriate images. For example, as discussed above, one aspect preferably utilizes
64x64 pixels as the image frame from which cursor motion is determined. Many
cameras, however, have many more pixels. These 64x64 arrays are therefore
preferably established through matched filtering. By way of example, an image of a
standard pair of user's eyes is stored within memory (according to one aspect of the
invention). This image field is cross-correlated with frames of data from the actual
image from the camera to "center" the image at the desired point. With eyes,
specifically, the 64x64 sample array is ideally centered so as to view both eyes within
the 64x64 array. Similarly, to process sequences of head data, a standard head image
is stored within memory, according to one aspect, and correlated with the actual
image to center the head view.
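
The centering step can be pictured as a search for the position where a stored template correlates most strongly with the camera image. The sketch below is a digital stand-in, not the optical matched filter itself: it uses a zero-mean cross-correlation (a common variant, assumed here so that uniformly bright regions do not dominate), and the function name `best_match` and the tiny test pattern are hypothetical.

```python
def best_match(image, template):
    """Return (row, col) of the top-left corner at which the template
    correlates most strongly with the image (zero-mean cross-correlation)."""
    th, tw = len(template), len(template[0])
    t_mean = sum(map(sum, template)) / (th * tw)
    best_score, best_pos = None, None
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            score = sum((template[i][j] - t_mean) * image[r + i][c + j]
                        for i in range(th) for j in range(tw))
            if best_score is None or score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos


# Hypothetical 5x5 image containing the 2x2 template pattern at row 2, col 1.
template = [[9, 0],
            [0, 9]]
image = [[0] * 5 for _ in range(5)]
image[2][1] = 9
image[3][2] = 9
corner = best_match(image, template)
```

Here `corner` is `(2, 1)`. In the setting described above, the template would be the stored eye or head image and the returned position would fix the origin of the 64x64 processing window.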

Those skilled in the art should appreciate that an appropriate frame size can
be established from an image having more or fewer pixels, by redundantly allocating
data into adjacent pixels or by eliminating intermediate pixels, or similar technique.
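
Both operations named above (duplicating data into adjacent pixels and eliminating intermediate pixels) are what a nearest-neighbour resize does. The following sketch assumes frames stored as lists of rows; the function name `resample` is hypothetical.

```python
def resample(image, out_rows, out_cols):
    """Nearest-neighbour resize of a frame stored as a list of rows.

    Enlarging repeats source pixels into adjacent output pixels;
    shrinking skips intermediate source pixels.
    """
    in_rows, in_cols = len(image), len(image[0])
    return [[image[r * in_rows // out_rows][c * in_cols // out_cols]
             for c in range(out_cols)]
            for r in range(out_rows)]


enlarged = resample([[1, 2],
                     [3, 4]], 4, 4)
reduced = resample([[1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12],
                    [13, 14, 15, 16]], 2, 2)
```

The same call with `out_rows=64, out_cols=64` would bring a larger or smaller frame to the 64x64 processing format discussed earlier.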

In another aspect, a camera is provided which optically "zooms" to provide
optimal imaging for a desired image zone. By way of example, the invention of one
aspect takes an image of the user's head, determines the location of the user's eyes
(such as by matched filtering), and optically zooms the image through movement of
optics to provide an image of the eyes in the desired processing size format.

Many aspects of the invention are preferably enhanced by autofocus.
Specifically, it is often desirable to have a crisp image of the user (or a part of the
user, e.g., the user's eyes) in order to accurately process desired cursor or scene view
movement. Thus, autofocus capability preferably operates in most of the aspects of
the invention where imaging is a feature of the processing.

In one aspect, the camera utilizes a very small aperture which results in a very
large depth of field. In such a situation, autofocus is not required or desired. The
optical requirements for the lenses are also reduced.

The invention thus provides several advantages over the art. For example,
game controllers can now include feedback corresponding to the user's actual
movement. By way of another example, if the user moves left or right (or the user's
head, hand or eyes move left or right, depending on the image zone), then the cursor (or
scene view) can also be set to move left or right. When the user twists her head, for
example, the scene view can also be made to rotate, reflecting that movement.

Those skilled in the art should appreciate that the direction in which the scene
moves, left or right, is a matter of design choice. That is, certain games might find it
desirable to move the scene in the direction opposite to the user's movement, to add
certain challenges to the game. Further, in other aspects, this direction can change during
the game to further complicate game control.



In accord with one aspect of the invention, a processing subsystem (connected
with the camera) is used to make cursor movement (or scene view movement)
correspond to the user's motion. This processing subsystem of another aspect further
detects when the user twists his head, to add an additional dimension to the
movement.

In one aspect, a system of the invention includes an IR detector which is used
to determine when a person sweats or heats up (by imaging, for example, part of the
user's head onto the IR detector); and then the system adjusts game speed in a way
corresponding to this change. Alternatively, a heartbeat sensor is tied to the
person to sense increased excitement during a game and the system speeds or slows
the game in a similar manner. Note that a heartbeat sensor can be constructed, in one
aspect of the invention, by thermal imaging of the user's face, detecting blood flow
oscillations indicative of heartbeat. In other aspects, the heartbeat sensor is
physically tied to the user, such as within the computer mouse or joystick.

In one aspect, a computer of the invention adapts to user control as selected
by a particular user. For example, in the case of a handicapped person, a particular
user might select certain hand-movements, e.g., a single finger up, to move the
cursor up; and another finger down to move the cursor left. A practically unlimited
number of control combinations can be established; this is one advantage of the
invention, in that users with many different disabilities can program cursor or scene
view movement. In one aspect, a neural network is used to assist the processing system
in establishing proper cursor movement. In another aspect, the computer for example
learns to print something by movement of the user's finger (or other body part).

In one aspect, tipping of the user's head (or other body part, or object) is used
to provide another degree of freedom in moving the cursor or adjusting the scene
view. By way of example, a tilt of the head, as imaged by the camera, can be set to
command a rotation of the scene view.



WO 99/35633 PCT/US99/00086

In still another aspect, a camera of the invention uses autozoom to move in
and out of a given scene view. By way of example, the camera is first focused on the
user's face in one frame; but in a subsequent frame the camera must focus closer
to compensate for the fact that the user moved closer to the camera (typically, the
camera is on the monitor, so this also means that the user moved closer to the scene
view). This autozoom is used, in one aspect, to make the scene view appear as if the
user is "creeping" into the scene. By moving the scene in and out, the user will
perceive that he is moving in or out of the scene view.

In another aspect, a camera images an object held by the user. Preferably, the
object has a well-defined shape. The system images the object and determines
difference information corresponding to movement of the object. By way of example,
rotating the object upside down results in difference information indicating the
inversion; and then the scene view inverts by operation of the system. In another
example, twisting of the object rotates the scene view left or right, or rotates the
scene in the direction of the twisting.

In another aspect, two cameras image the user: one camera pointed at the
front of the user's face or hand and the other down at the top of the user's head or
hand. The front facing camera is used to detect rotational and linear translation in
up-down and left-right directions. The top viewing camera determines front-back and
left-right translation. The front-back translation observed by the top camera is used
to control forward and back motion in the user's 3-D view. The top sensed left-right
translation controls the user's left-right slide or strafe. The top sensed left-right
motion is removed from the front view left-right translation with the remaining
front view measure representative of left-right twist. All of the front view up-down
translation can be interpreted as up-down twist.
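
The way the two camera measurements are combined can be written out explicitly. The sketch below follows the description above; the function name `decompose_motion`, the dictionary keys, and the assumption that both cameras report motion in the same units are illustrative only.

```python
def decompose_motion(front_lr, front_ud, top_lr, top_fb):
    """Combine front-camera and top-camera motion measurements.

    front_lr, front_ud: left-right and up-down motion seen by the front camera.
    top_lr, top_fb:     left-right and front-back motion seen by the top camera.
    Assumes both cameras report motion in the same units.
    """
    return {
        "forward_back": top_fb,          # top camera drives forward/back motion
        "strafe": top_lr,                # top camera drives left-right slide
        "twist_lr": front_lr - top_lr,   # remainder after removing translation
        "twist_ud": front_ud,            # all front up-down motion is twist
    }


motion = decompose_motion(front_lr=5, front_ud=2, top_lr=3, top_fb=-1)
```

In this example the front camera sees 5 units of left-right motion, of which 3 are explained by translation seen from above, leaving 2 units attributed to left-right twist.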

Brief Description of the Drawings 

Figure 1 illustrates one human computer interface system constructed
according to the invention;

Figure 1A shows an exemplary computer display illustrating cursor
movement made through the system of Figure 1;

Figure 1B illustrates overlaid scene views, displayed in two moments of
time on the display in Figure 1, of a shifting scene made in response to user
movement captured by the camera of Figure 1;

Figure 1C shows an illustrative frame of data taken by the system of Figure 1;

Figure 2 illustrates selected functions for a printed circuit card used in the 
system of Figure 1; 

Figure 3 illustrates an algorithm block diagram that preferably operates with 
the system of Figure 1; 

Figure 4 illustrates one preferred algorithm process used in accord with the 
invention to determine and quantify body motion; 

Figure 5 shows one process of the invention for communicating body motion 
data to a host processor, in accord with the invention, for augmented control of 
cursor position or scene view; 

Figure 5A shows a representative frame of data of a user taken by a camera of
the invention, and further illustrates adding symbols to key body parts to facilitate
processing;

Figure 6 illustrates a two camera imaging system for implementing the
teachings of the invention;

Figure 7 illustrates two positions of a user as captured by a camera of the
invention; and Figure 7A illustrates two positions of a scene view on a display as
repositioned in response to movement of the user illustrated in Figure 7;

Figure 8 illustrates motion of a user - and specifically twisting of the user's
head - as captured by a system of the invention; Figure 8A illustrates a first scene
view corresponding to a representative computer display before the twisting; Figure
8B illustrates a second scene view corresponding to a rotation of the first scene view
in response to the twisting by the user; Figure 8C shows processing features of the
processing section of Figure 8; and Figure 8D illustrates multiple image frames
stored in memory for matched filtering with raw images acquired by the system of
Figure 8;

Figure 9 illustrates a two camera system of the invention for collecting N 
zones of user movement and for repositioning the cursor or scene view as a function 
of the N movements; Figure 9A illustrates a representative thermal image captured 
by the system of Figure 9; and Figure 9B illustrates process methodology for 
processing thermal images as a real time input to game processing speed, in accord 
with the invention; 

Figure 10 illustrates another two camera system of the invention for targeting 
multiple image movement zones on a user, and further illustrating optional DSP 
processing at the camera section; 

Figure 11 illustrates framing multiple movement zones with a single imaging 
array, in accord with the invention; 

Figure 12 illustrates framing a user's eyes in accord with the invention; and 
Figure 12A shows a representative image frame of a user's eyes;

Figure 13 illustrates one system of the invention, including zoom, neural nets,
and autofocus to facilitate image capture;

Figures 14, 14A and 14B illustrate autofocus motion control in accord with
the invention;

Figures 15 and 15A illustrate one other motion detect system algorithm 
utilizing edge detection, in accord with the invention; 

Figure 16 illustrates one other motion detect system algorithm utilizing
well-characterized object manipulations, in accord with the invention;

Figure 17 illustrates one other motion detect system algorithm utilizing varied 
body motions, in accord with the invention; 

Figure 18 illustrates a two camera system of the invention with a camera 
observing the user's face while the other observes the top of the user's head; 

Figure 19 shows a blink detect system of the invention; and 

Figure 20 shows a re-calibration system constructed according to the 
invention. 

Detailed Description of the Drawings 

Figure 1 illustrates, in a top view, certain major components of a human 
computer interface system 10 of the invention. A user 12 of the system 10 sits facing 
a computer monitor 14 with display 14a. A camera 16 is mounted on the computer 
monitor 14 facing the user 12. In the illustrated embodiment, the camera 16 is 
mounted in such a way that the user's face 12a is imaged within the camera's field of 
view 16a. However, as discussed herein, the camera 16 can alternatively image other
locations, such as the user's hand or eyes, or other objects; so imaging of the user's
face, in Figure 1, must be considered illustrative, rather than limiting. Further, the
camera location can also reside at places other than on top of the monitor 14.

With further regard to Figure 1, the camera 16 interfaces with a printed circuit
card 18 mounted within the user's computer chassis 20 (which connects with the
monitor 14 by common cabling 20a). The camera 16 interfaces to the printed circuit
card 18 via a camera interface cable 22. The circuit card 18 also has a processing
section 18a, such as a digital signal processing ("DSP") chip and software, to process
images from the camera 16.

In operation, the camera 16 and card 18 capture frames of image data
corresponding to user movement 25. The processing section 18a algorithmically
processes the image data to quantify that movement 25, and then communicates this
information to the host processor 30 within the computer 20. The host processor 30
then commands movement of the computer cursor in a corresponding movement
25a, Figure 1A (Figure 1A illustrates a representative front view of the display 14a,
and also illustrates movement 25a of the cursor 26 moving within the display 14a in
response to user movement 25).

Figure 1B illustrates an alternative (or supplemental) process whereby the
scene view shifts in response to user movement 25. Specifically, Figure 1B illustrates
a first scene view 35a, which generally corresponds to a forest prior to the user's
movement 25; and an overlaid scene view 35b (shown in dotted line, for purposes
of illustration) that is shifted by an amount 37 in response to the user's movement 25.
The shift 37 in the scene view 35 is accomplished by combined operation and
processing of the processing section 18a and host CPU 30.

Figure 1C shows a representative frame 41 of data 43 as taken by the camera
16. As illustrated, data 43 represents the user's face 12a taken at a given moment in
time. Subsequent frames (not shown) are used to determine user motion 25 relative
to the frame 41, as discussed herein. The frame 41 is made up of a plurality of
pixel data 45, as known in the art.



Figure 2 illustrates certain functions processed within the printed circuit 
board 18 of Figure 1. A camera interface circuit 50 receives video data from the 
camera 16 through interface cable 22. This video data can be RS170 format or digital, 
for example. For analog RS170 format, circuit 50 decodes the analog video data to 
determine video timing signals embedded in the analog data. These timing signals 
are used for control of the analog-to-digital (A/D) converter included in circuit 50 
that converts analog pixel data into digital images. In the preferred embodiment, the
analog data is digitized into 6 bits, though a greater number of bits may be
acceptable and/or required for features as discussed herein. For digital data format,
camera interface 50 accepts the digital data without additional quantization, 
although interface 50 can digitally pre-process the digital images if desired to 
acquire desired image features. 

The frame difference electronics 52 receives digital data from the camera 
interface circuit 50. The frame difference electronics 52 include a multiple frame 
memory, a subtraction circuit and a state machine controller/memory addresser to
control data flow. The frame memory holds previously digitized frame data. As 
each digitized pixel is received by the frame difference electronics 52, the 
corresponding pixel from a previous frame is read from the frame memory and 
subtracted from the current frame. The preferred implementation uses the frame 
just previous to the current frame, though an older frame which resides in the frame 
memory may be used. The resulting difference is output to an N-frame video 
memory 54. The new frame pixel data is then stored into the frame memory of the 
frame difference electronics 52. 
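
By way of illustration only, the pixel-by-pixel operation of the frame difference electronics 52 can be sketched in software as follows. The class name `FrameDifferencer`, the use of nested lists for frames, and the all-zero output for the very first frame are assumptions made for this sketch; the disclosure describes a hardware subtraction circuit and does not specify these details.

```python
class FrameDifferencer:
    """Software sketch of frame differencing with a one-frame memory."""

    def __init__(self):
        # Frame memory: holds the previously digitized frame (None until one arrives).
        self.previous = None

    def push(self, frame):
        """Subtract the stored frame from `frame`, then store `frame`."""
        if self.previous is None:
            # Assumption: with no earlier frame, report "no change" everywhere.
            diff = [[0] * len(row) for row in frame]
        else:
            diff = [
                [cur - old for cur, old in zip(row, prev_row)]
                for row, prev_row in zip(frame, self.previous)
            ]
        # The new frame replaces the old one in the frame memory.
        self.previous = [list(row) for row in frame]
        return diff
```

In this sketch unchanged pixels difference to zero, which is what removes static background from the data passed on to the N-frame video memory 54.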

The N frame video memory electronics 54 either receives differenced frames 
output by the frame difference electronics 52 (discussed above) or raw digitized 
frames from the camera interface 50. The choice of where the data derives from is
made by software resident on the DSP 56. The frame video memory 54 is sized to
hold more than one full frame of video and up to N number of frames. The number 
of frames N is to be driven by hardware and software design. 

In the preferred embodiment, the DSP 56 implements an algorithm discussed
below. This algorithm determines the rate of head motion of the user in two
dimensions. The digital signal processor 56 also detects the eye blink of the user in
order to emulate the click and double click action of a standard mouse button. In
support of these functions, the DSP 56 commands the N frame video memory 54 to
supply either the differenced frames or the raw digitized frames. The digital signal
processor thus preferably utilizes a supporting program memory 58 made up of
electrically reprogrammable memory (EPROM) and data memory 59 including
standard volatile random access memory (RAM). The DSP 56 also interfaces to the
PCI bus interface electronics 60 through which cursor and button emulation is
passed to the user's main processor (e.g., the CPU 30, Figure 1). The PCI interface 60
also passes raw digitized video to the main processor as an optional feature.
Interface 60 also permits reprogramming of program memory 58, to allow for future
software upgrades permitting additional features and performance.

The PCI interface electronics 60 thus provides an industry standard bus
interface supporting the aforementioned communication path between the printed
circuit card 18 and the user's main processor 30.

With optional MPEG compression electronics 62, the printed circuit card 18
and camera 16 can provide compressed video to the user's main processor 30. This
compressed video supports using the system 10 in teleconferencing applications,
providing dual use as a human computer interface system 10 and/or as a
teleconferencing system, in an economical solution to two distinct applications.

Figure 3 describes one preferred head motion block diagram algorithm 70
used in accord with the invention. Not all of the functions shown in Figure 3 are
implemented in software in the DSP 56. Rather, this algorithm relies on the
correlation of images from one frame to the next, and particularly relies on the use of
frame differenced images in the correlation process. The frame differencing
operation removes parts of the camera images that are unchanged from the previous
frame. For example, room background (such as object 13, Figure 1) behind the user 
12 is removed from the image. This greatly simplifies detection of feature motion. 
Even the image of the user's face consists of regions of uniform illumination
such that even with the user's facial motion, these uniform regions (i.e., cheeks,
forehead, chin) may also be removed. The user's face 12a also consists of typically
dynamic features such as the nose, eyes, eyebrows and mouth, each of which
typically has enough spatial detail to be evident in the differenced image. As
the user moves his face with respect to room lighting, the shape and distribution of
these features will change; but the frame rate of the camera 16 ensures that these 
features look similar from one frame to the next. The correlation process therefore 
operates to determine how these differenced features are moving from one frame to 
the next in order to determine user head motion 25. 

The algorithm of block diagram 70, Figure 3, receives video images 72 of the 
user as imaged by camera 16 over time. Each received image is passed to both a 
frame memory 74 and a differencer 76. Though the preferred embodiment is to 
buffer a single frame in memory 74, the memory 74 may optionally store many 
frames, buffered such that the first frame input is the first frame output. The 
delayed frame is read from the frame memory 74 and subtracted from the current 
frame using the differencer 76. Frame output from the differencer 76 is provided to 
both a correlation process 78 and a difference frame memory 80. 

Like the frame memory 74, the preferred embodiment of frame memory 80 
utilizes a single difference frame; however the difference frame memory 80 can hold 
many difference frames in sequence for a finite time period. The delayed difference 
frame is read from the difference frame memory and provided to the correlation 
function 78. Difference frames are preferably selectable by system algorithms. The
correlation process 78 determines the best combination of row and column shifts in
order to minimize the difference between the current difference frame and the 
delayed difference frame. The number of rows and columns required to align these 
difference images provides information as to the user's motion. 
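
As a hedged illustration of the search performed by the correlation process 78, the sketch below tries every row and column shift within a small window and keeps the shift that minimizes the summed absolute difference between the current and delayed difference frames. The function name `best_shift`, the `max_shift` search window, the circular wrap-around at the frame edges, and the sum-of-absolute-differences cost are all assumptions chosen for brevity; they are not taken from the disclosure.

```python
def best_shift(current, delayed, max_shift=3):
    """Return (row_shift, col_shift) that best aligns `delayed` onto `current`."""
    rows, cols = len(current), len(current[0])
    best = None  # (error, row_shift, col_shift)
    for dr in range(-max_shift, max_shift + 1):
        for dc in range(-max_shift, max_shift + 1):
            # Compare each current pixel with the delayed pixel that would land
            # on it after shifting by (dr, dc); edges wrap around (assumption).
            err = sum(
                abs(current[r][c] - delayed[(r - dr) % rows][(c - dc) % cols])
                for r in range(rows)
                for c in range(cols)
            )
            if best is None or err < best[0]:
                best = (err, dr, dc)
    return best[1], best[2]
```

A returned shift of (0, 0) corresponds to no detected motion; any other value gives the number of rows and columns by which the tracked features moved between the two difference frames.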

The best-fit function algorithm 82 determines the row and column shift to 
provide optimal alignment. In the case of a classical correlation process, the best-fit 
function can consist of a peak detect algorithm. This algorithm may either be 
implemented in hardware or in software. 

The best-fit function algorithm determines relative motion in rows and
columns of the observed user's features. The cursor update compute function
algorithm 84 translates this measured motion into the position change required of
the cursor (e.g., the cursor 26, Figure 1A). Typically, this is a non-linear process such
that, with greater head motion, the cursor moves a non-proportionally greater distance.
For example, a 1-pixel user motion can cause the cursor to move one screen pixel
while a 10-pixel user motion may cause a 100-pixel screen cursor motion. However,
these magnifications can be adjusted for the desired result. This algorithm may be
implemented either in hardware, such as through an ASIC or FPGA, or in software.
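
One non-linear mapping consistent with the example figures above (1 pixel of user motion to 1 screen pixel, 10 pixels to 100 screen pixels) is a square law. The sketch below is offered only as an illustration; the function name `cursor_delta` and the `gain` and `exponent` parameters are hypothetical, and the disclosure does not state which non-linear curve is used.

```python
import math


def cursor_delta(motion_px, gain=1.0, exponent=2.0):
    """Map measured user motion (pixels, signed) to cursor motion (screen pixels)."""
    # Magnitude grows faster than linearly; exponent=2 reproduces 1 -> 1, 10 -> 100.
    magnitude = gain * abs(motion_px) ** exponent
    # Preserve the direction of the measured motion.
    return int(round(math.copysign(magnitude, motion_px)))
```

The same function would be applied separately to the row and column components of the measured shift; `gain` and `exponent` stand in for the adjustable magnifications mentioned in the text.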

Video cursor control 86 provides a user interface to enable and disable the
operation of cursor control described above. This control is implemented, for
example, through a combination of keystrokes on the user's keyboard (for example
as connected to the host computer 20, Figure 1). Alternatively, cursor control is
activated or deactivated by sensing the eye-blink of the user (or some other
predetermined movement). In this alternative embodiment, an output signal 85 from
the correlation section 78 is sent to the video enable section 86; and the output signal
85 corresponds to blink data from the user's face 12a (Figure 1A). In another
embodiment, the video cursor control section 86 activates or deactivates cursor
control by recognizing voice commands. A microphone 87 detects the user's voice
and a voice recognition section 89 converts the voice to certain activate or deactivate
signals. For example, the section 89 can be set to respond to "activate" as a voice
command that will enable cursor control; and "deactivate" as a command that
disables cursor control.

The functionality of the video cursor control 86 provides the user with the
equivalent of a mouse pick-up, put-down action. As the user moves the cursor from
left to right across the screen, the user would de-activate motion-based cursor
control in order to allow the user to move his head back to the left. Once the user
has recentered his head, the user would once again activate the cursor control and
continue to move the cursor about the screen. The activation/deactivation of the
mouse input is represented by the switch 90, such that the open position of the
switch disables human motion control of the cursor and supplies a zero change input
to the summation operation 92 in such conditions.

Those skilled in the art should appreciate that control of scene view may also
be implemented by an algorithm such as shown in Figure 3. Specifically, a similar
algorithm can provide movement of the current scene view, in accord with the
invention.

With video cursor control enabled, the result of the cursor update compute
function 84 is added to the known current cursor position at the summing operation
92. This summation has an x component and a y component. The result of the
summation 92 is used to update the cursor position (or scene view) on the user's
screen via the user's operating system. Cursor position may thus be controlled by
both user motion as well as the motion imparted by another input device such as a
standard computer mouse.
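
The interaction of the switch 90 and the summing operation 92 can be sketched as below, purely for illustration. The function name `update_cursor`, the tuple representation of positions, and the optional `mouse_delta` argument (standing in for a conventional mouse) are assumptions of this sketch.

```python
def update_cursor(position, head_delta, control_enabled, mouse_delta=(0, 0)):
    """Return the new (x, y) cursor position after one update."""
    # Switch 90: when open (control disabled), a zero change is supplied instead
    # of the head-motion result, giving the "pick-up, put-down" behaviour.
    applied = head_delta if control_enabled else (0, 0)
    # Summation 92: x and y components are summed independently, together with
    # any motion from another input device.
    return (
        position[0] + applied[0] + mouse_delta[0],
        position[1] + applied[1] + mouse_delta[1],
    )
```

For example, a user could move the cursor right with control enabled, disable control while recentering the head (the cursor stays put), and then re-enable control to continue.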

Figure 4 provides a detailed description of the preferred implementation of
the algorithm described in functions 74, 76, 78, 80 and 82 of Figure 3. Video data is
received by the processing electronics in both a single frame memory 100 and a
differencer 102. The output of the frame memory 100 is also provided to the
differencer 102 such that the previous frame is subtracted from the current frame.
This differenced frame is then processed by a two dimensional FFT 104. The complex
result of the FFT 104 is provided to a complex multiplier 106 and a complex memory
108. The complex memory 108 is the size of the processed image, each location
containing both a real and imaginary component of a complex number. With each
new FFT operation 104, the previous FFT result, contained in the complex memory
108, is provided to the conjugate operation 110. The complex conjugate of each
element is computed and provided to the complex multiplier 106. In this manner,
the FFT of the previous frame difference is conjugated and multiplied against the
FFT of the current difference image.

By way of comparison between Figures 3 and 4, item 76 has similar
functionality to item 102; item 78 has similar functionality to items 104, 106, 108, 110,
112; item 80 has similar functionality to item 108; and item 82 has similar
functionality to item 114.

The two dimensional array of complex products output by the complex
multiplier 106 is provided to a two dimensional inverse FFT operation 112. This
operation creates an image of the correlation function 114 between the latest pair of
difference images. The correlation image is processed by the peak detection function
114 in order to determine the shift required in aligning the two difference images.
The x-y magnitude of this shift is representative of the user's motion. This x-y
magnitude is provided to the software used to update the cursor position as
described in Figure 3.
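
The transform, conjugate, multiply, inverse-transform and peak-detect chain of Figure 4 can be sketched as follows. This is a minimal illustration under stated assumptions: a naive discrete Fourier transform (`dft2`) is substituted for a true FFT so that the example needs only the Python standard library, the frames are small nested lists, and the function names and the wrap-around handling of negative shifts are choices made for this sketch rather than details of the disclosure.

```python
import cmath


def dft2(image, inverse=False):
    """Naive 2-D discrete Fourier transform (row pass, then column pass)."""
    sign = 1 if inverse else -1

    def dft1(vec):
        n = len(vec)
        out = [
            sum(vec[k] * cmath.exp(sign * 2j * cmath.pi * u * k / n) for k in range(n))
            for u in range(n)
        ]
        return [v / n for v in out] if inverse else out

    row_pass = [dft1(row) for row in image]
    col_pass = [dft1(list(col)) for col in zip(*row_pass)]
    return [list(row) for row in zip(*col_pass)]


def correlation_shift(current_diff, previous_diff):
    """Return the (row, col) shift of the current difference image vs. the previous."""
    f_cur = dft2(current_diff)          # transform of current difference (104)
    f_prev = dft2(previous_diff)        # stored transform (complex memory 108)
    # Conjugate (110) and multiply (106), element by element.
    product = [
        [a * b.conjugate() for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(f_cur, f_prev)
    ]
    corr = dft2(product, inverse=True)  # inverse transform (112)
    n_rows, n_cols = len(corr), len(corr[0])
    # Peak detection (114): location of the largest real correlation value.
    peak_r, peak_c = max(
        ((r, c) for r in range(n_rows) for c in range(n_cols)),
        key=lambda rc: corr[rc[0]][rc[1]].real,
    )
    # Peaks past the half-way point represent negative (wrapped) shifts.
    if peak_r > n_rows // 2:
        peak_r -= n_rows
    if peak_c > n_cols // 2:
        peak_c -= n_cols
    return peak_r, peak_c
```

A practical implementation would use a fast transform and would keep the previous transform in memory rather than recomputing it, as the complex memory 108 does.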

Figure 5 shows an algorithm process 130 of the invention which applies
motion correlation operations over sub-frames of the video image. This allows
motions of various body parts to convey input with specialized meaning to
applications operating on the host computer. In addition to head motion, motion of
the hands, arms and legs provides greater degrees of freedom for the user to
interact with the host application (e.g., a game). Commands of this type are useful in
combative games where computer animated opponents fight under control of the
user. In that instance, the hand, arm and leg motions of the user become punch, chop
and kick commands to the computer after process 130. This command mode can
also be used in situations where the user does not have ready access to a keyboard,
to augment cursor control of the previously described head position correlator.

Process 130 identifies the functions required to derive commands from 
general motions of the user's body. The scene analyzer function 132 receives 
digitized video frames from the camera (e.g., the camera 16 of Figure 1) and 
identifies sub-frames within the video for tracking various parts of the user's body. 
The frame difference function 134 and correlator function 136 provide functions
similar to processes 74, 76 and 78 of Figure 3. The correlation analyzer 138
receives correlated difference frames from the correlator function 136 and sub-frame 
definitions from the scene analyzer 132. The correlation analyzer 138 applies a peak 
detection function to each sub-frame to identify the shift required to achieve best 
alignment of the two images. Correlation peaks occurring in the center of the
sub-frame indicate no motion, while peaks occurring elsewhere indicate the direction
and magnitude of the user's motion. The motion interpreter 140 receives motion
vectors for each sub-frame from the correlation analyzer 138. The motion interpreter 
140 links the motion vector from each sub-frame with a particular body segment and 
passes this information onto the host interface 142. The host interface 142 provides 
for communication with the host processor (e.g., CPU 30, Figure 1). It sends data 
packets to the host to identify detected body motions, their directions and their 
amplitudes. The host interface 142 also receives instructions from the host as to which
body segments to track, which it then passes along to the motion interpreter 140 and
the scene analyzer 132.
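
The roles of the correlation analyzer 138 and motion interpreter 140 can be illustrated with the sketch below, in which each sub-frame has a correlation surface whose peak offset from center gives that body segment's motion vector. The function names, the dictionary keyed by body-part label, and the choice to report only segments that actually moved are assumptions of this sketch.

```python
def motion_vector(corr):
    """Offset (rows, cols) of the correlation peak from the sub-frame center."""
    rows, cols = len(corr), len(corr[0])
    peak_r, peak_c = max(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: corr[rc[0]][rc[1]],
    )
    # A peak at the center yields (0, 0), i.e. no motion.
    return peak_r - rows // 2, peak_c - cols // 2


def interpret(sub_frame_surfaces):
    """Link each sub-frame's motion vector to its body segment label."""
    report = {}
    for part, corr in sub_frame_surfaces.items():
        vector = motion_vector(corr)
        if vector != (0, 0):
            # Only segments with detected motion are reported to the host.
            report[part] = vector
    return report
```

The resulting dictionary plays the part of the data packets sent through the host interface 142, identifying which body segments moved and in what direction and magnitude.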

The scene analyzer 132 first identifies the location of the user's body in the
image and locates the position of various parts of the user's body such as hands,
forearms, head, and legs. The user's body location and body part positions can be
identified using techniques well known to those skilled in the art (by way of example,
via matched filtering). Body identification can also be augmented by marking different
locations on the user's body with unique visual symbols. Unique symbols are assigned
to key body joints such as elbows, shoulders, hands, neck, knees, and waist and are
mounted on the body. See for example Figure 5A.

Figure 5A illustrates one frame 149 of data of an image of the user 150 as
taken by a camera of the invention. The image corresponds to a full body image of
the user 150, including arms 151, legs 152, elbows 151a, hands 153, head 154, neck
155, ears 156, and forehead 157. These parts 151-157 are identified by processes of the
invention (e.g., spatial location in the image, by matched filtering or other image
recognition technique), and the image is preferably marked with unique symbols
(e.g., "X" for center of the face, "Y" for center of the hand 153, "T" for center of the
user's foot, "Z" for body center, and "F" for forehead 157).

With further reference to Figure 5, process 130 locates various body parts and
preferably marks them with symbols to fill in connecting logic (e.g., the left wrist
and left elbow symbols identify the location of the left forearm). Once the user's body
parts are located, sub-frames surrounding each of the body segments identified by
the host processor are generated. A sub-frame is a generally regularly shaped region
within the image that surrounds a particular body part. The sub-frames are sized to
center the subject body part in the sub-frame and to provide enough room around
the body part to accommodate typical body motions. One sub-frame 160 is shown in
frame 149, Figure 5A, surrounding the user's foot "T". The scene analyzer 132 will
generally not operate on each frame of video since continuously changing the
sub-frames adds unnecessary complication to the correlation analyzer 138. Instead, the
scene analyzer 132 runs as a background process updating the sub-frame locations
periodically.
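
A sub-frame of the kind described above might be computed as in the following sketch, which centers a rectangle on a located body part and pads it with a margin for typical motion. The function name `sub_frame`, the pixel `margin` parameter, the (left, top, right, bottom) return convention, and the clamping to the image edges are hypothetical details introduced for illustration.

```python
def sub_frame(center, part_size, margin, image_size):
    """Return (left, top, right, bottom) of a sub-frame around a body part."""
    cx, cy = center
    part_w, part_h = part_size
    img_w, img_h = image_size
    # Half-extent of the body part plus room for typical body motions.
    half_w = part_w // 2 + margin
    half_h = part_h // 2 + margin
    # Keep the rectangle inside the image.
    left = max(0, cx - half_w)
    top = max(0, cy - half_h)
    right = min(img_w, cx + half_w)
    bottom = min(img_h, cy + half_h)
    return left, top, right, bottom
```

Because the scene analyzer 132 runs only periodically, such rectangles would be recomputed occasionally rather than on every video frame.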

Figure 4 provides a detailed description of one algorithm which can be used
to implement processes 134-138 of Figure 5.

The invention of one embodiment can thus track the motion of the user's
body using symbols attached to key joints. As an example, the position of the user's
left lower arm can be determined by locating the unique symbols for the left hand
"Yl" and the left elbow. Unique symbols thus allow the processor to rapidly locate
each portion of the user's body in a video frame. To determine the motion of a
particular part of the user's body, the algorithm (e.g., Figure 4) compares the
position of the relevant body parts in consecutive frames and determines how they
moved (for example, using geometry). Once motion is determined, it is then passed
to the host CPU where the motion is acted on as appropriate for the particular
application.

Figure 6 illustrates a two camera system 200, constructed according to the
invention. The cameras 202a, 202b are arranged to view separate parts of the user:
camera 202a images the user's face 204; and camera 202b images the user's hand 206.
The cameras 202 conveniently rest on top of the computer display 208 coupled to the
host computer 210 by cabling 216. The cameras 202 couple to the signal processing
card 212 residing within the computer 210 by cabling 213. As discussed herein,
motion of the user's head 204 and/or hand 206 is detected by the signal processing
card 212, and difference information is communicated to the computer's CPU 210a
via the computer bus 214. This difference information corresponds to composite
movement of the head 204 and hand 206; and is used by the CPU 210a to command
movement of display items on the display 208 (for example, the display items can
include the cursor or scene view as shown on the display 208 to the user).
Information shown on the display 208 is communicated from the computer 210 to
the display 208 along standard cable 216.

Figures 7 and 7A illustrate how motion of a user's head is, for example,
translated to motion of the cursor and/or scene view, in accord with the invention.
Figure 7 shows a representative image 220 of a user captured within a frame of data
by a camera of the invention. Figure 7 also shows a representative image 222 (in
dotted outline, for clarity of illustration) of the user in a subsequent frame of data,
indicating that the user moved "M" inches. Figure 7A illustrates corresponding
scene views on a computer display 224 that is coupled to processing algorithms of
the invention (i.e., within a system that includes a camera that captures the images
220, 222 of Figure 7). The display 224 illustratively shows a scene view that includes
a road 224a that extends off into the distance, and a house 224b adjacent to the road
224a. A computer cursor 224c is also illustratively shown on the display 224, as such
a cursor is common even within computer games, providing a place for the user to
select items (such as the road or house 224a, 224b) within the display 224. The
display 224 also shows, with dotted outlines 226, the scene view of road and house
which are shown on the display 224 after motion by the user from 220 to 222, Figure
7 (the cursor 224c is for example repositioned to position 224c'). The repositioning of
the scene view from 224a, 224b to 226 occurs immediately (typically less than 1/30
second, depending upon the camera) after the movement of the user of Figure 7
from 220 to 222. The scene view is repositioned by x-pixels on the display 224, so
that M/x corresponds to the magnification between user movement and scene view
repositioning. This magnification can be set by parameters within the system; and
can also be set by the user, if desired, at the computer keyboard. Furthermore, the
scene view preferably moves the distance of x-pixels at the
same rate as the rate of travel along distance M. Alternatively, the magnification can
be dependent on the rate of motion such that a larger displacement of x-pixels will
occur for a given motion M if the rate of change of M is larger.
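
The fixed and rate-dependent magnifications described above can be sketched as follows. The function name `scene_shift`, the `base_gain` value in pixels per inch, and the linear dependence on rate through `rate_gain` are assumptions introduced for this illustration; the disclosure states only that a faster motion can produce a larger displacement.

```python
def scene_shift(m_inches, dt_seconds, base_gain=50.0, rate_gain=0.0):
    """Return the scene displacement x (pixels) for a user motion M (inches)."""
    # Rate of change of M, in inches per second.
    rate = abs(m_inches) / dt_seconds
    # With rate_gain == 0 the magnification is fixed; otherwise faster motions
    # are magnified more (assumed linear dependence on rate).
    gain = base_gain * (1.0 + rate_gain * rate)
    return m_inches * gain
```

With `rate_gain` left at zero the mapping is a simple fixed magnification, which corresponds to the parameter a user might set at the keyboard.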

Figure 8 illustrates a further motion that can be captured by a camera of the
invention and processed to reposition a scene view, as shown in Figures 8A and 8B.
More particularly, Figure 8 illustrates a camera 250 connected to a processing section
252 which converts user motion 254 to corresponding repositioning of the computer
scene view. As above, the user 256 is captured by the camera's field of view 258 and
frames of data are captured by the processing section 252. In Figure 8, motion 254
corresponds to a twisting of the user's head 256; and processing section 252 detects
this twisting and provides repositioning information to the host computer (not
shown). Processing section 252 can also incorporate head-translation motion (e.g.,
illustrated in Figure 15A) into the scene view movement above; and can similarly
reject translational movement, if desired, so that no scene motion is observed for
translation of the user 256.

Figure 8A shows a representative scene view 260 on a display 262 coupled to
the host computer. Figure 8B illustrates repositioning of the scene view 260' after the
processing section 252 detects motion 254 and updates the host computer with
difference information (e.g., that information which the host computer uses to rotate
or translate the scene view).

Figure 8A also illustrates the intent of the rotating scene view feature. In
Figure 8A, a person 260a is shown in the scene view 260, except that the person 260a
is almost completely obscured by the edge 262a of the display 262. By twisting the
head 256 in motion 254, the scene view 260 is rotated in the corresponding direction
- as shown by scene view 260' in Figure 8B - so that the person 260a' is completely
visible within the scene view 260'.

Figure 8C illustrates further detail of the processing section 252. Camera data
such as frames of images of a user are input to the section 252 at data port 266. The
data are conditioned in the image conditioning section 268 (for example, to reduce
correlated noise or other image artifacts). Thereafter, the camera data is compared
and correlated in the image correlation section 270, which compares the present
frame image with a series of stored images from the image memory 272. In the
preferred embodiment, the present data image frame 249 is cross-correlated with
each of the images within the image memory 272 to find a match. These images
correspond to a series of images of the user in known positions, as illustrated in
Figure 8D.

In Figure 8D, various images are stored representing various known positions
of the relevant part, here the user's head 256. In the position of Figure 8, for example a
straight on face shot, the 0° stored memory image would provide the greatest
cross-correlation value, indicating a matched image position. Accordingly, the scene view
would adjust to a zero position. If, however, the image correlated to a -90° position,
the scene would rotate to such a position. Other movements cause additional scene
view motions, including tilt and tip of the head, as shown in the two images "0°,
Down 45°" and "0°, Up 45°". These images cause the scene view to move
upwards or to tilt up or down, when the processing section 252 correlates the current
frame to these images. As indicated, these images have no left or right component,
though other images (not shown) can certainly include left, right and tip motion
simultaneously.
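
The matching of a current frame against stored images of known head positions can be illustrated with the sketch below, which scores each stored image by normalized cross-correlation at zero offset and returns the label of the best match. The function names, the use of (rotation, tilt) tuples in degrees as labels, and the particular normalization are assumptions of this sketch; the disclosure specifies only that the frame is cross-correlated with each stored image to find a match.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equally sized images (zero offset)."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    mean_a = sum(flat_a) / len(flat_a)
    mean_b = sum(flat_b) / len(flat_b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(flat_a, flat_b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in flat_a) * sum((y - mean_b) ** 2 for y in flat_b)
    )
    # A flat image carries no pattern to match; score it as zero.
    return num / den if den else 0.0


def best_pose(frame, stored_images):
    """Return the label of the stored image that best matches `frame`."""
    return max(stored_images, key=lambda label: ncc(frame, stored_images[label]))
```

The returned label would then be translated by the host computer into the corresponding rotation or tilt of the scene view.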

Figure 9 shows a system 300 constructed according to the invention and
including a camera section 302 including an IR imager 304 and a camera 306, both of
which view and capture frames of data from a user 308. The IR imager 304 can
include, for example, a microbolometer array (i.e., "uncooled" detectors known in
the art) which produces a frame of data corresponding to the infrared energy
emitted from the user, such as illustrated in Figure 9A. Figure 9A shows a
representative frame of IR image data 310, with zones 312 of relatively hot image
data emitting from regions of forehead, nose and mouth of the user 308.

The cameras 304, 306 send image data back to the signal processing section
314. Data from the camera 306 is processed, if desired, as above, to determine the
difference information signal 322 used by a connected computer to reposition the
cursor and/or scene view. Data from camera 304, on the other hand, is used to
evaluate how large (or how hot) the zones 312 appear on the user during play of a
computer game. The signal processing section 314 assesses the zones 312 for temperature
and/or size over the course of a computer game and generates a "game speed
control" signal 320 which is communicated to the user's computer (i.e., that
computer used in conjunction with the system 300 of Figure 9). The user's computer
processes the signal 320 to increase or decrease the speed of a computer game in
process on the computer.



Those skilled in the art should appreciate that the IR camera 304 can be used 
without the features of the invention which assess user movement. Rather, this 
aspect should be considered stand-alone, if desired, to provide active feedback into 
gaming speed based upon user temperature and/or stress. Note that the camera 304
can also be used to detect heartbeat since the zones 312 generally pulse at the user's
heartbeat, so that heartbeat rate can also be considered as a parameter used in the
generation of the game speed control signal 320. Alternatively, a pulse rate can be 
determined by known pulse rate systems that are physically connected to the user 
308. 

An IR lamp 324 can be used in system 300 to illuminate the user 308, with IR 
radiation 324a, such that sufficient IR illumination reflects off of the user 308 
whereby motion control of the cursor and/or scene view can be made without the
additional camera 306. The lamp 324 can be, and preferably is, made integrally with 
the section 302 to facilitate production packaging. 

An IR lamp 324 operating in the near-IR can also be used with visible cameras 
of the invention which typically respond to near-IR wavelengths. By way of 
example, certain camera systems now available incorporate six IR emitters around 
the lens to illuminate the object without distraction to the user who cannot see the 
near-IR emitted light. Such a camera is suitable for use with the invention. 

Figure 9B shows process methodology of the invention to process thermal 
user images in accord with the preferred embodiment of the invention. Specifically, a 
system such as system 300 first acquires a thermal image map in process block 326. 
This image is compared to a reference image ("REF") in process block 327. REF can
either be a temperature of the user (i.e., a temperature of one hot spot of a
non-stressed user, or the temperature of one hot spot of the user at an initial, pre-game
condition) or an amount of the area 312, Figure 9A, of the user (in a non-stressed
condition or initial pre-game condition). By way of example, REF can be an image
such as the frame 310 of Figure 9A. When the temperature or area of the region 312
increases, the system 300 detects this change and determines that the image map
exceeded the REF condition, as illustrated in process block 328. Should the map
exceed the REF condition, the system 300 communicates this to the host processor 
which in turn adjusts the gaming speed, as desired. If the map does not exceed the 
REF condition, then the next IR image frame is acquired at block 326. 

System 300 and the process steps of Figure 9B are thus suitable to adjust
gaming speed in real time, depending upon user stress level. In the preferred
embodiment, the gaming speed is increased automatically such that the image map
exceeds the REF signal for greater than about 50% of the time, so that all users,
regardless of their ability, are pushed in the particular game.
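
By way of a non-limiting sketch, the feedback loop of Figure 9B might be expressed in C as follows. The function names `hot_zone_area` and `update_game_speed`, the flat thermal-map layout, the temperature threshold and the step size of 0.05 are all assumptions made here for illustration; the description above specifies only that the image map is compared to REF and that gaming speed is raised until REF is exceeded for more than about 50% of the time. Lowering the speed when that fraction is already exceeded is likewise an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Count pixels hotter than a threshold, as a stand-in for the "area" of
 * zones 312. The threshold and the flat map layout are assumptions. */
size_t hot_zone_area(const float *thermal_map, size_t n_pixels,
                     float hot_threshold)
{
    size_t area = 0;
    assert(thermal_map != NULL);
    for (size_t i = 0; i < n_pixels; i++) {
        if (thermal_map[i] > hot_threshold)
            area++;
    }
    return area;
}

/* One pass of the Figure 9B loop (blocks 326-328), reduced to its decision.
 * exceed_fraction is the running fraction of frames in which the image map
 * exceeded REF. Speed is nudged up while that fraction is below 0.5 and
 * nudged down otherwise; the step size is hypothetical. */
float update_game_speed(float speed, float exceed_fraction)
{
    const float step = 0.05f;
    assert(exceed_fraction >= 0.0f && exceed_fraction <= 1.0f);
    if (exceed_fraction < 0.5f)
        return speed + step;
    return (speed - step > 0.0f) ? speed - step : speed;
}
```

In use, the host would call `hot_zone_area` on each acquired thermal frame, compare the result with the area measured for REF, update the running fraction, and then pass that fraction to `update_game_speed`.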

Those skilled in the art should appreciate that multi-camera embodiments of
the invention can be, and preferably are, incorporated into a common housing 338, such
as shown in Figure 10. Further, as illustrated in Figure 10, cameras can also be made
from detector arrays 340, processing electronics 342, and optics 344. Each camera
340, 342, 344 is constructed to process the correct electromagnetic spectrum, e.g., IR
(using, for example, germanium lenses 344 and microbolometer detectors 340). Each
camera has its own field of view 350a, 350b and focal distance 352a, 352b to image at
least a part of the user. These fields of view 350 can overlap, to view the same area
such as the user's face, or they can view separate locations, such as the user's head
and hand.

Cameras of the invention can also include a DSP section 356 such as described 
above to process user motion data. The DSP section 356 processes user motion data 
and sends difference information to the user's host computer. The host computer 
thereafter repositions the cursor and/or scene view based upon the difference
information so that the user observes corresponding motion on the computer
display, as described above. Accordingly, the DSP section need not reside within the
computer so long as difference information is isolated and communicated to the host
computer CPU.

Those skilled in the art should appreciate that algorithms of the invention can 
also be processed directly by the computer's CPU, provided it has sufficient power 
and processing speed, to eliminate a separate DSP chip or section. DSP or equivalent
processing capability can also be provided within the computer's chassis by way of a
computer printed circuit card installed in the chassis and connected with the camera. 
The location and amount of processing power, therefore, should be considered a 
matter of design choice and current state of the art, each technique being within the 
scope of the invention. 

Figure 11 illustrates frame capture by one camera of the invention to isolate 
zones of imaging according to expected motion patterns. In Figure 11, one frame 370 
of data for example covers the user's eyes 371, corresponding to one image zone; and 
another frame 372 of data can cover the user's head 373, corresponding to another 
image zone. As mentioned previously, preferably the frames 370, 372 are 64x64 
pixels each, or 256x256 (or higher powers of two) to provide FFT capability on the 
image within the frame. A single camera can however provide both frames 370 and 
372, in accord with the invention. Specifically, a dense CCD detector array (e.g., 
480x740 pixels, 1000x1000 pixels, or higher) is used within the camera such that the 
whole array captures an image frame 376 of data, at least covering the available 
image format of the computer display 378. A matched filter (or other image
location process) is applied to the frame 376 to locate the center 371a of the user's
eyes (in the matched filtering process, an image data set of the user's eyes is stored in
memory and correlated to the frame 376 such that a peak correlation is found at
position 371a). Thereafter, a 64x64 array of data is centered about the eyes 371 to set
the frame 370. To acquire the frame 372, every other pixel is discarded so that, again,
a 64x64 array is set for the frame 372 (alternatively, each adjacent pair of pixels is
added and averaged to provide a single number, again reducing the total number of
pixels to 64x64). Note that this process is reasonable since the width of the eyes is at
least 1/2 the width of the user's face. Nevertheless, further compression can be
obtained by utilizing every third pixel (or averaging three adjacent pixels) to obtain
a larger image area in the frame 372. Note that the compression in the width and
length dimensions need not be the same.
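
The pixel-averaging alternative described above can be sketched in C as shown below. This is an illustrative sketch only: the function name `downsample_average`, the 8-bit row-major source layout and the parameter names are assumptions and are not taken from the source code reproduced later in this description. Separate factors `fx` and `fy` are used because the compression in the width and length dimensions need not be the same.

```c
#include <assert.h>
#include <stddef.h>

/* Reduce a region of (out_h * fy) rows by (out_w * fx) columns to an
 * out_h x out_w frame by averaging each fy-by-fx block of pixels.
 * src_stride is the number of pixels per row in the source array. */
void downsample_average(const unsigned char *src, size_t src_stride,
                        size_t fx, size_t fy,
                        float *dst, size_t out_w, size_t out_h)
{
    assert(src != NULL && dst != NULL && fx > 0 && fy > 0);
    for (size_t r = 0; r < out_h; r++) {
        for (size_t c = 0; c < out_w; c++) {
            unsigned long sum = 0;
            for (size_t dy = 0; dy < fy; dy++) {
                for (size_t dx = 0; dx < fx; dx++)
                    sum += src[(r * fy + dy) * src_stride + (c * fx + dx)];
            }
            dst[r * out_w + c] = (float)sum / (float)(fx * fy);
        }
    }
}
```

With `fx = fy = 2` and `out_w = out_h = 64`, a 128x128 region around the user's head is reduced to the 64x64 frame 372; with factors of 3, a 192x192 region is covered.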

Framing of the information in Figure 11 can occur in several ways. Most 
cameras image at 30Hz so that image motion is smooth to the human eye. In one
embodiment, one frame 370 is taken in between each frame 372, to minimize data
throughput and processing, and yet to maintain dual processing of the two zones
imaged in Figure 11. Alternatively, both frames 370, 372 are processed concurrently 
since frame 376 is typically the 30Hz frame. 

Figure 11 also illustrates how framing can occur around the user's eyes 371 to
acquire "blink" information to reset cursor control. A blink of the user's eyes
detected in frame 370 (or other frame) can be used to (a) disable or enable control of
cursor or scene movement based upon user control, or (b) simulate pick-up and
replacement of the computer mouse (i.e., reinitializing movement in a particular
direction). For example, by detecting a blink of the eyes 371, a system of the
invention can disable human motion following control such as described herein.
Another blink can be used to enable human motion following control. Blinking can
also be used to continue motion in a particular direction. For example, movement of
the cursor can be made to follow movement of the user's head, as described above.
However, after a while, the person has to move to an uncomfortable position to keep
moving the cursor (or scene). A blink can thus also allow the user to reposition the head back
to a normal starting position so that further movement in the desired direction can
be made.

Figure 12 illustrates a similar capture of a user's eyes 400, in accord with the 
invention. A frame 402 can thus be acquired by a camera of the invention. Figure
12A illustrates further detail of one representative frame 402, illustrating that the
user's pupils 404 are also captured. Figures 3 and 4 describe certain algorithms of the
invention that are also applicable to motion of the user's pupils 404, as illustrated by 
left and right motion 406 and up and down motion 408. Accordingly, by zooming in 
on the user's eyes, another movement zone is created that causes repositioning of the 
cursor or the scene view based upon the movements 406, 408, much like the head 
movement described and illustrated in Figures 1-4. 

Note that the teachings of Figures 1-4 and 12-12A can be combined within a 
two zone movement system so that, for example, both head motion and pupil 
motion can be evaluated for image motion. The cursor and/or scene view can be
repositioned, therefore, based upon movements from both zones. By way of
example, repositioning of items within the display (e.g., the cursor and/or scene
view) can be made when the head moves but not if the head and eyes both move, which
might indicate that the user is simply looking elsewhere in the room due to a
distraction. However, if the user moves his head, but not his eyes, he is focussed on
the game and intends rotation of the scene view, in another example. Other
combinations are also possible. 
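
One such combination, the example just given in which head motion is applied only when the pupils stay still, might be sketched in C as follows. The names `combine_zones` and `zone_decision`, the single shared threshold and the per-frame displacement inputs are assumptions for illustration; the description does not prescribe how the two zones are compared.

```c
#include <assert.h>

typedef enum { IGNORE_MOTION, APPLY_HEAD_MOTION } zone_decision;

/* Nonzero when either displacement component exceeds the threshold. */
static int moved(float dx, float dy, float threshold)
{
    float ax = dx < 0.0f ? -dx : dx;
    float ay = dy < 0.0f ? -dy : dy;
    return ax > threshold || ay > threshold;
}

/* Two-zone rule: reposition the cursor or scene view only when the head
 * zone moved and the pupil zone did not. Head plus pupil motion is treated
 * as a distraction and ignored. The threshold is a hypothetical tuning value
 * in pixels. */
zone_decision combine_zones(float head_dx, float head_dy,
                            float pupil_dx, float pupil_dy, float threshold)
{
    assert(threshold >= 0.0f);
    if (moved(head_dx, head_dy, threshold) &&
        !moved(pupil_dx, pupil_dy, threshold))
        return APPLY_HEAD_MOTION;
    return IGNORE_MOTION;
}
```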

Cameras of the invention can also include zoom optics which (a) reduce or 
enlarge the image frame captured by a particular camera, or which (b) provide 
autofocus capability. Figure 13 shows one system 430 constructed according to the 
invention. A camera 432 includes camera electronics 432a and a zoom attachment 
432b. Data from the camera 432 is relayed to image and interpretation feedback 
electronics 434 for evaluation. For purposes of image magnification, the feedback 
electronics serve to evaluate a given image size relative to desired image goals. For 
example, to image the user's eyes with high fidelity might require high density of 
pixels at the user's eyes (e.g., at the zone 370, Figure 11). Accordingly, the system 430 
can isolate the user's eyes, such as described herein, and command the camera 
(through command lines 436) to increase or decrease magnification on the user's 
eyes so as to achieve desired resolution. The feedback electronics can also command 
motion of the camera to change its boresight alignment (i.e., to change where the
camera image is centered) by commanding movement of the camera when resting on
one or more linear drives 438, as known in the art. 

Once aligned on the desired user location, e.g., on the eyes with desired 
accuracy, the system 430 continues processing data such as described herein to create
human interface control of items displayed on the user's computer, e.g., cursor
and/or scene view. Accordingly, processing section 440 operates to detect user
motion and to communicate difference information to the user's computer, as
described above.

With regard to autofocus, the system 430 of Figure 13 can also be used to
process user motion based upon motion towards and away from the camera. Figure
14 illustrates such a system, including a camera 450 with autofocus capability to find
the best focus 452 relative to a user 454 within the field of view 456. For example,
when the user 454 moves to position 460 (the user being shown in outline form
454a), the best focus changes to 452a. The camera 450 provides a signal 450a
to the image interpretation and feedback electronics 434, Figure 13, which indicates
where the user is along the "z" axis from the camera 450 to the user 454. This signal
450a is thus used much like the other motion signals described herein, to move the
cursor and/or scene view in response to such movements. Figure 14A illustrates a
representative scene view 462 when, for example, the user is at best focus 452. The
scene view 462 includes a house image 464 with a door 465. When the user moves to
position 460, the house and door 464', 465' of the scene view 462' enlarge, since the
user moved closer to the camera 450. Such a motion might reveal, for example,
additional objects within the house, such as illustrated by object 466, Figure 14B.
Accordingly, the autofocus feature of the invention provides yet another degree of
freedom in motion control, in accord with the invention.

Image data, manipulation, and human interface control can be improved,
over time, by using neural net algorithms. As shown in Figure 13, a neural net
update section 435 can for example couple to the feedback electronics 434 so as to
assimilate movement information and to improve data transmitted to the host
computer, over time. The use of neural nets is known in the art.

Figure 15 illustrates a frame of data 490 used in accord with the invention to
implement a simplified left, right, up, down movement algorithm to control cursor
movement and/or scene view movement. Frame 490 is captured by a camera of the
invention; and preferably the camera incorporates autofocus, as described above, to
provide a crisp image of the user 492 regardless of her position within the camera's
field of view. As shown, image frame 490 provides very sharp edges to the user's
face, including a left edge 494a, right edge 494b, and chin 494c. These edges need
only approximate vertical or horizontal position. Movement of the user results in
movement of the edges 494, such as shown in Figure 15A. Figure 15A shows that
once such edges are acquired, they conveniently permit subsequent movement
analysis and control of scene view and/or cursor position. Specifically, Figure 15A
shows movement of the user's "edges" from 494a-c to 494a-c', indicating that the
user moved left (as viewed from the camera's position) and that her chin raised
slightly, indicating an upward tilt of the head. This information is assessed by
the process sections as discussed above and relayed to the host computer as
difference information to augment or provide cursor and/or scene movement in
response to the user's movement.

Note that such edge movements roughly correspond to movement along
rows and columns of the detector array. From detected movement from one row to
another (or one column to another), the actual motion of the user can readily be
calculated using the user's best focus position and the focal length of
the camera's lens. This information may then be used to set the magnification of
movement of items in the computer display (e.g., cursor and/or scene view).
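
The calculation referred to above can be illustrated with a short C sketch. It assumes a simple pinhole model in which the user's distance is much larger than the focal length, so that displacement at the user equals displacement on the detector multiplied by distance over focal length. The function name `user_motion_mm`, the pixel pitch parameter and the use of millimetres are assumptions made for illustration only.

```c
#include <assert.h>

/* Convert a shift measured in detector rows or columns into the approximate
 * physical motion of the user, under a pinhole-camera assumption:
 *   motion = pixel_shift * pixel_pitch * (focus_distance / focal_length)
 * All lengths are in millimetres. */
double user_motion_mm(double pixel_shift, double pixel_pitch_mm,
                      double focus_distance_mm, double focal_length_mm)
{
    assert(focal_length_mm > 0.0 && pixel_pitch_mm > 0.0);
    return pixel_shift * pixel_pitch_mm * focus_distance_mm / focal_length_mm;
}
```

For example, a shift of 4 columns on a detector with a 0.01 mm pitch, with the user at a best focus distance of 600 mm and a 6 mm lens, corresponds to roughly 4 mm of user motion. The ratio of desired cursor travel to this physical motion then sets the magnification applied on the display.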

Figure 16 illustrates an image of one object 500 used in accord with the
invention to provide image manipulation in response to motion of the object. The
object 500 is held by the user 501 to manipulate motion of his cursor 502 and/or
scene view 504 on his computer display 506. The object 500 is used because it
exhibits an optical shape that is easily recognized through image correlation (such as
matched filtering). In accord with the invention, a camera 510 is used to image the
object 500; and frames of data are sent to the frame processor 512. The processor 512
determines image position, relative to a starting position, and thereafter
communicates difference information to the user's computer 505 along data line 514.
The difference information is used by the computer's CPU and operating system to
reposition items on the display 506 in response to motion of the object 500. Almost
any motion, including rotation, tilting and translation, can be accomplished with the
object 500 relative to a start position. This start position can be triggered by the user
501 at the start of a game by commanding that the camera 510 take a reference frame
("REF") that is stored in memory 513. The user 501 commands that REF imagery be
taken and stored through the keyboard 505a, connected to the computer 505, which
in turn commands the processor 512 and camera 510 to take the reference frame REF.

Motion of the object 500 is thus detected with enhanced accuracy by
comparing subsequent frames of the object 500 with REF. When motion of rotation,
tilt or translation is detected (for example, by using the techniques of Figures 2-4, 8-8D),
then items (502, 504) on the display 506 are repositioned to follow that
movement.

The techniques of the invention permit control of the scene view and/or
cursor on a computer screen by motion of one or more parts of the user's body.
Accordingly, as shown in Figure 17, complete motion of the user 598 can be
replicated, in the invention, by correlated motion of an action figure 599 within a
game. In Figure 17, user 598 is imaged by a camera 602 of the invention; and frames
from the camera 602 are processed by process section 604, such as described herein.
The user 598 is captured and processed, in digital imagery, and annotated with
appropriate user segments, e.g., segments 1-6 indicating the user's hands, feet, head
and main body. Motion of the segments 1-6 is communicated to the host computer
606 from the process section 604. The computer's operating system then updates the
associated display 608 so that the action figure 599 (corresponding to an action figure
within a computer game) moves like user 598. Accordingly, the user 598 moves the
action figure 599 by performing the stunts (e.g., striking and kicking)
that he would like the action figure 599 to perform, such as to knock out an
opponent within the display 608.

In an alternative embodiment, icons can be used to simplify image and
motion recognition of user segments such as segments 1-6. For example, if user 598 is
marked with a star-shaped object on her hand (e.g., segment 1), then that star symbol
is more easily recognized by algorithms such as described herein to determine
motion. By way of another example, the hand of user 598 can be covered with a
glove that has a symbol on it. That symbol can be used to more
easily interpret user motion as compared to, for example, actually interpreting
motion of the user's hand, which is rounded with five fingers. In a third example,
user 598 can wear an article of clothing such as shirt 598a with a symbol 598b; and
the invention can be used to track the icon 598b with great precision since it is a
relatively easy object to track as compared to actual body parts. It should be
apparent to those skilled in the art that icons such as symbol 598b can be painted or pasted
onto the individual too to obtain similar results.

Figure 18 illustrates a two camera system 700 used to determine translation
and rotation. The forward viewing camera 702 observes the user's face 703 and
determines the right-left (Δx1) and up-down (Δy1) translation of the user's face 703.
The top viewing camera 704 observes the top of the user's face or head 705 and
determines the right-left (Δx2) and forward-backward motion (Δy2) of the user's face
or head. The two cameras 702, 704 are each processed through motion sensing
algorithms 706 using the teachings above, and results are shown on the computer
display 710. For purposes of illustration, the display 710 shows an image of the user;
however, the image can be, for example, an action figure or other computer object
(including the computer cursor), as desired, which follows tracking motions Δx1, Δy1,
Δx2, Δy2. As indicated in Figure 18, for example, Δy2 can be directly applied to
motion control of the user's forward and reverse motion (note, these motions are
illustrated as within a computer display 710 as processed by algorithms 706). Δx1
can be directly applied to the user's left-right sideways or strafe motion; Δy1 can be
directly applied to control the user's up-down viewpoint, each as illustrated on
display 710a. The results of the difference between Δx2 and Δx1 can be applied to
control the user's left-right turn or viewpoint.
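
The mapping just described, from the four tracked displacements to four game controls, can be summarized in a short C sketch. The structure `view_motion`, its field names and the function name `map_two_camera_motion` are assumptions introduced for illustration; only the assignments themselves come from the description of Figure 18.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    double forward;  /* forward and reverse motion, from dy2            */
    double strafe;   /* left-right sideways (strafe) motion, from dx1   */
    double pitch;    /* up-down viewpoint, from dy1                     */
    double turn;     /* left-right turn or viewpoint, from dx2 - dx1    */
} view_motion;

/* dx1, dy1: right-left and up-down motion seen by the forward camera 702.
 * dx2, dy2: right-left and forward-backward motion seen by the top
 * camera 704. The result is written to *out. */
void map_two_camera_motion(double dx1, double dy1, double dx2, double dy2,
                           view_motion *out)
{
    assert(out != NULL);
    out->forward = dy2;
    out->strafe  = dx1;
    out->pitch   = dy1;
    out->turn    = dx2 - dx1;
}
```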

The techniques of Figure 18 can be further extended to front, side and top 
view cameras for complete motion detection. The top camera determines the user's 
left-right, front-back motion while the front facing camera determines the user's
rotational up-down, left-right motion. 

Figure 19 describes an algorithm to detect user eye blink. The video imagery
is stored into a multiple frame buffer 800. The algorithm selects the current frame
and a frame from the frame buffer and differences these frames using the adder 802.
The difference frame consists of the pixel by pixel difference of the delayed frame
and the current frame. The difference frame includes motion information used by
the algorithms taught above. It also contains information on the user eye blink.
The frames differenced by the adder 802 are separated temporally enough to ensure
that one frame contains an image of the user's face with the eyes open while the other
contains an image of the user's face with the eyes closed. The difference image contains two
strong features, one for each eye. These features are spatially separated by the
distance between the user's eyes. The blink detect function 808 inspects the image
for this pair of strong features which are aligned horizontally and spaced within an
expected distance based on the variation from one human face to another and the
variation in seating distance expected from user to user. The recognition of the blink
features may be accomplished using a matched filter or by recognition of expected
frequency peaks in the frequency domain at the expected spatial frequency for
human eye separation. The blink detect function 808 identifies the occurrence of a
blink to a controlling function to either disable the cursor motion or take some other
action.

Figure 20 illustrates a sound re-calibration system 800 constructed
according to the invention. As above, a camera 802 is arranged to view a user, a part
of a user (e.g., a hand), or an object through the camera's field of view 804. A
processing section 806 correlates the framing image data from camera 802 to induce
movement of a scene view or cursor on the user's display 810. For purposes of
illustration, the scene view or cursor is shown illustratively as a dot 808 on display
810; and movement 812 of the cursor 808 from position 808a to 808b represents a
typical movement of the cursor or scene view 808 in response to movement within
the field of view 804, as described above. A re-calibration section 816 is used to reset
the cursor or scene view 808 back to an initial position 808a, if desired. Specifically,
in one embodiment, section 816 is a microphone that responds to sound 818
generated from a sound event 818a, such as a snap of the user's fingers, or a
particular word uttered by the user, to generate a signal for processing section 806
along signal line 816a; and section 806 processes the signal to move the cursor or
scene view 808 back to original position 808a. In another embodiment, re-calibration
section 816 can also correspond to a processing section within the processing
hardware/software of system 800 to, for example, respond to the blink of a user's
eyes to cause movement of the cursor 808 back to position 808a.
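
The blink test of Figure 19, which can also drive a re-calibration section such as section 816, might be sketched in C as follows. This sketch replaces the matched filter and frequency-domain options mentioned in the description with a simpler search for the two strongest difference pixels, and that substitution is an assumption made for brevity. The function name `detect_blink` and all parameter names and thresholds are likewise hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 when the difference frame holds two strong features that lie on
 * roughly the same row and whose column separation is within
 * [min_sep, max_sep]; returns 0 otherwise.
 * diff is a row-major width x height difference frame (current frame minus
 * delayed frame). strength is the minimum absolute difference that counts
 * as a strong feature. */
int detect_blink(const int *diff, int width, int height, int strength,
                 int row_tol, int min_sep, int max_sep)
{
    int r1 = -1, c1 = -1, best1 = strength;
    int r2 = -1, c2 = -1, best2 = strength;
    assert(diff != NULL && strength > 0);
    assert(min_sep > 0 && max_sep >= min_sep);

    /* First feature: the strongest pixel in the difference frame. */
    for (int r = 0; r < height; r++) {
        for (int c = 0; c < width; c++) {
            int v = abs(diff[r * width + c]);
            if (v >= best1) { best1 = v; r1 = r; c1 = c; }
        }
    }
    if (r1 < 0)
        return 0;

    /* Second feature: the strongest pixel at least min_sep columns away. */
    for (int r = 0; r < height; r++) {
        for (int c = 0; c < width; c++) {
            int v = abs(diff[r * width + c]);
            if (abs(c - c1) >= min_sep && v >= best2) {
                best2 = v; r2 = r; c2 = c;
            }
        }
    }
    if (r2 < 0)
        return 0;

    return abs(r2 - r1) <= row_tol && abs(c2 - c1) <= max_sep;
}
```

The separation limits stand in for the expected variation in eye spacing and seating distance from user to user; a return value of 1 would be reported to the controlling function.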

The invention thus attains the objects set forth above, among those apparent 
from the preceding description. Since certain changes may be made in the above 
methods and systems without departing from the scope of the invention, it is 
intended that all matter contained in the above description or shown in the 
accompanying drawing be interpreted as illustrative and not in a limiting sense. By
way of example, although FFT correlation is often discussed in the above
description, it should be apparent to those skilled in the art that other correlation
techniques can be used with the invention to achieve similar results without
departing from the scope hereof.

The following Matlab source code provides non-limiting computer code
suitable for use to control the cursor on a display such as described herein. The
Matlab source code thus provides an operational demonstration of the concepts
described and claimed herein. The Matlab source code is platform independent and
needs only a sequence of input images. It includes a centroid operation on the
correlation peak which is not included in the PC version (described below),
providing a finer measurement of the motion in the image. More particularly, the
centroid operation provides a refinement on locating the correlation peak. The PC
code, discussed below, uses the pixel location nearest the correlation peak while the
centroiding operation improves the resolution of the peak location to levels below a
pixel.



% Copyright (C) 1998
% Video Mouse Group Partnership
%
% This following script file reads in a sequence of images of a computer user's face.
% It then processes the image sequence using the methods of difference
% frame correlation processing used for a human-computer interface.
% This code includes a centroiding operation and demonstrates the
% difference frame correlation approach.

[filename,filepath]=uigetfile('d:\videomouse\*.mat');
files=dir([filepath '*.mat']);
previousFrame=zeros(64);
inputMatrix1=zeros(64);
inputMatrix2=zeros(64);
cursorPosX(1)=0;
cursorPosY(1)=0;
for i=1:size(files,1);
    load([filepath 'frame' num2str(i)]);
    camera=conv2(double(camera'),ones(2)/4,'same');
    camera=camera(1:2:480,1:2:640);
    camera=round(camera(120-32:120+31,160-32:160+31)/4);
    if mod(i,2)
        %Compute difference frame
        inputMatrix2=camera-previousFrame;
        %Save current difference frame for next iteration
        previousFrame=camera;
        %FFT difference frame
        inputMatrix2=fft2(inputMatrix2);
        %Perform difference frame correlation by multiplying difference frame by
        %complex conjugate of previous frame
        correlationMatrix=0.00001*conj(inputMatrix2.*conj(inputMatrix1));
    else
        %Compute difference frame
        inputMatrix1=camera-previousFrame;
        %Save current difference frame for next iteration
        previousFrame=camera;
        %FFT difference frame
        inputMatrix1=fft2(inputMatrix1);
        %Perform difference frame correlation by multiplying difference frame by
        %complex conjugate of previous frame
        correlationMatrix=0.00001*conj(inputMatrix1.*conj(inputMatrix2));
    end
    %Compute inverse FFT on correlation matrix
    correlationMatrix=real(fft2(correlationMatrix));
    %Shift correlation matrix to center correlation peak
    temp=fftshift(correlationMatrix);
    %Find maximum value of correlation matrix
    correlationPeak=max(correlationMatrix(:));
    %Perform centroiding on correlation peak in order to find peak location
    %The centroiding operation is not currently incorporated into the application version
    mask=temp>.50*correlationPeak(1);
    maskSize(i)=sum(mask(:));
    if maskSize(i)<100
        colSums=sum((temp-.50*correlationPeak(1)).*mask);
        xPos(i)=sum(colSums.*(-32:31))/sum(colSums);
        rowSums=sum((temp'-.50*correlationPeak(1)).*mask');
        yPos(i)=sum(rowSums.*(-32:31))/sum(rowSums);
    else
        xPos(i)=0;
        yPos(i)=0;
    end
    if i>1
        cursorPosX(i)=cursorPosX(i-1)+xPos(i);
        cursorPosY(i)=cursorPosY(i-1)+yPos(i);
    end
end
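
For comparison with the PC code that follows, the centroiding step of the Matlab listing above might be rendered in C roughly as shown below. This is an illustrative sketch, not part of either source listing: the function name `centroid_peak` and its interface are assumptions, it expects a correlation surface that has already been shifted so that zero displacement sits at index n/2 (as `fftshift` does above), and it omits the mask-size check (`maskSize(i)<100`) that the Matlab code applies before centroiding.

```c
#include <assert.h>
#include <stddef.h>

/* Sub-pixel peak location by centroiding all values above half the peak.
 * corr is an n x n row-major correlation surface. On success the function
 * returns 0 and writes the peak offset, in pixels relative to the center,
 * to *x_pos (columns) and *y_pos (rows). It returns -1 when no value
 * exceeds half the peak. */
int centroid_peak(const float *corr, size_t n, float *x_pos, float *y_pos)
{
    float peak, sum_w = 0.0f, sum_x = 0.0f, sum_y = 0.0f;
    assert(corr != NULL && x_pos != NULL && y_pos != NULL && n > 0);

    /* Find the maximum value of the correlation surface. */
    peak = corr[0];
    for (size_t i = 1; i < n * n; i++) {
        if (corr[i] > peak)
            peak = corr[i];
    }

    /* Weight each pixel above half the peak by its excess over half peak. */
    for (size_t r = 0; r < n; r++) {
        for (size_t c = 0; c < n; c++) {
            float w = corr[r * n + c] - 0.5f * peak;
            if (w > 0.0f) {
                sum_w += w;
                sum_x += w * ((float)c - (float)(n / 2));
                sum_y += w * ((float)r - (float)(n / 2));
            }
        }
    }
    if (sum_w <= 0.0f)
        return -1;
    *x_pos = sum_x / sum_w;
    *y_pos = sum_y / sum_w;
    return 0;
}
```

When two adjacent pixels share the peak value, the centroid falls halfway between them, which is the sub-pixel resolution that the nearest-pixel approach of the PC code cannot provide.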

The following PC source code, labeled videoMouseDlg.doc and
videoMouseDSP.doc, provides non-limiting and nearly operable DSP code for control
of the cursor, as described herein. The code is not polished; and there are other files
required to compile this code to an executable, as will be apparent to those skilled in
the art, including header files (*.h), resource files and compiler directives.



/*===========================================================================
 * Copyright (C) 1998
 * Video Mouse Group Partnership
 *===========================================================================
 * Module : dspcode.c
 *===========================================================================*/

/*========================= MODIFICATION HISTORY ===========================*/

/*=========================== MODULE ENVIRONMENT ===========================*/

/*
 * Include files
 */
#include "dddefs.h"     /* XPG definitions and prototypes. */
#include "ddptypes.h"   /* XPG function prototypes. */
#include "dberrors.h"   /* XPG error codes. */
#include "xpgreg.h"     /* XPG register definitions */
#include "grabisr.h"    /* grab isr include file */
#include "intpt40.h"    /* Parallel Runtime Support Library interrupts header */
#include "protocol.h"   /* Protocol constants defined for this application */

/*
 * Function prototypes.
 */
extern VOID DBU_App1Interrupt (VOID);
extern VOID DDK_PKTDelay (ULONG);

/*
 * Global Variables
 */
volatile LONG G_lApp1UserMaskIntCount = 0;

/*================================= CODE ===================================*/

/*
 * Name       : DBU Correlation
 *
 * Parameters :
 *   lInputImage1   Image number of the 1st input image.
 *   lInputImage2   Image number of the 2nd input image.
 *
 * Returns    : Error status
 *              P_SUCCESS
 *
 *              lMotionX
 *              lMotionY
 *              lFrameRate
 */

/* FFT STUFF FROM TI */
#define FFTSIZE     64            /* FFT size (n)           */
#define LOGFFTSIZE  6             /* log(FFT size)          */
#define FFTSIZEx2   128
#define BLOCK0      0x002ff800    /* on-chip RAM buffer     */

extern void cfftc();              /* C-callable complex FFT */



PCT/US99/00086 



float complexMatrix1[FFTSIZE][FFTSIZEx2];    /* Input matrix */
float complexMatrix2[FFTSIZE][FFTSIZEx2];    /* Input matrix */
float correlationMatrix[FFTSIZE][FFTSIZEx2];
long  previousFrame[FFTSIZE][FFTSIZE];
float *p_localRam, *fPtr1, *fPtr2;
float correlationPeak;
float *block0 = (float *)BLOCK0, *mm1[FFTSIZE], *mm2[FFTSIZE], *mm3[FFTSIZE];
/* End FFT Stuff */

LONG DBU_UserFunction (LONG lInputImage1, LONG lInputImage2)
{
    PROCESSINGINFO ImageInfo1;
    PROCESSINGINFO ImageInfo2;
    LONG lErrorStatus;
    LONG lFifoStatus;
    LONG lOutputFifoStatus;
    LONG lFrameCount = 0;
    LONG lStatus;
    register LONG *plAddress1;
    register LONG *plAddress2;
    register LONG *p_imageAddress;
    long *lPtr;
    float *currentDifference;
    float *previousDifference;
    ULONG ulPCData;
    ULONG ulTemp0;
    ULONG ulValue;
    ULONG pulPacket[5];
    long peakRow;
    long peakCol;
    long row;
    long col;
    long pixel;
    long index1;
    long index2;
    long index3;

    int targetCol = 16, targetRow = 16, targetDirection = 0;

    /*-------------------------------------------------------------------*/
    /* Initialize some variables.                                        */
    /*-------------------------------------------------------------------*/
    lErrorStatus = P_SUCCESS;
    ulPCData = APPLICATION_RUNNING;
    lFifoStatus = P_EMPTY;

    /*-------------------------------------------------------------------*/
    /* Install the ISR found in the file INTERRUP.ASM to interrupt       */
    /* resource IIOF0. Initialize the G_lApp1UserMaskIntCount.           */
    /*-------------------------------------------------------------------*/
    DDF_ISRSetIIOF0 (P_INTERRUPT_USER_MASK, (VOID *) DBU_App1Interrupt);
    G_lApp1UserMaskIntCount = 0;

    /*-------------------------------------------------------------------*/
    /* Set the ID of the second image for double buffering.              */
    /* Perform a quick grab setup on the input image number.             */
    /* Call Quick Grab with no wait. This application runs the grab      */
    /* in continuous mode, the grab will not return until the DMA        */
    /* has started.                                                      */
    /*-------------------------------------------------------------------*/
    if ((lStatus = DBF_SetSecondImageID (P_DEFAULT_QGS, lInputImage2))
            != P_SUCCESS)
    {
        ulPCData = END_APPLICATION_REQUEST;
        lErrorStatus = lStatus;
    } /* End if. */

    if ((lStatus = DBF_QuickGrabSetup (P_DEFAULT_QGS, lInputImage1))
            != P_SUCCESS)
    {
        ulPCData = END_APPLICATION_REQUEST;
        lErrorStatus = lStatus;
    } /* End if. */

    if ((lStatus = DBF_QuickGrab (P_DEFAULT_QGS, P_GRAB_INIT,
                                  P_GRAB_NO_WAIT))
            != P_SUCCESS)
    {
        ulPCData = END_APPLICATION_REQUEST;
        lErrorStatus = lStatus;
    } /* End if. */

    /*-------------------------------------------------------------------*/
    /* Enable hardware interrupts on IIOF0                               */
    /*-------------------------------------------------------------------*/
    INT_ENABLE ();
    set_iif_flag (IIOF0_EIIOF);

    /*-------------------------------------------------------------------*/
    /* Initialize pointers to the two image buffers                      */
    /*-------------------------------------------------------------------*/
    if ((lStatus = DBK_MmtGetImageInfo (lInputImage1, &ImageInfo1))
            != P_SUCCESS)
    {
        ulPCData = END_APPLICATION_REQUEST;
        lErrorStatus = lStatus;
    } /* End if. */

    if ((lStatus = DBK_MmtGetImageInfo (lInputImage2, &ImageInfo2))
            != P_SUCCESS)
    {
        ulPCData = END_APPLICATION_REQUEST;
        lErrorStatus = lStatus;
    } /* End if. */

    plAddress1 = (LONG *) (ImageInfo1.PRO_MappedAddress);
    plAddress2 = (LONG *) (ImageInfo2.PRO_MappedAddress);



/* FFT Initialization Stuff*/ 

asm(" or 1800h,st"); /* cache enable */ 

/*End FFT Initialization */ 

/* ------------------------------------------------------------ */
/* While the input fifo from the PC does not contain */
/* any data, continue processing frames and returning */
/* the results to the PC. */
/* ------------------------------------------------------------ */

while (ulPCData != END_APPLICATION_REQUEST) {
    if (lFrameCount < (G_lApplUserMaskIntCount)) {
        lFrameCount = G_lApplUserMaskIntCount;

        p_imageAddress = (G_lApplUserMaskIntCount % 2)
                         ? plAddress1 : plAddress2;
        if (G_lApplUserMaskIntCount % 2) {
            /* p_imageAddress=plAddress2; */
            currentDifference = &complexMatrix2[0][0];
            previousDifference = &complexMatrix1[0][0];
        }
        else {
            /* p_imageAddress=plAddress1; */
            currentDifference = &complexMatrix1[0][0];
            previousDifference = &complexMatrix2[0][0];
        }

        /************* Compute FFT on Difference Frame Rows *************/

        for (row = 0; row < FFTSIZE; row++) {
            for (col = 0; col < FFTSIZE; col++) {
                lPtr = p_imageAddress + 2*col + 256*row;
                pixel = *lPtr + *(lPtr+1) + *(lPtr+128) + *(lPtr+129);

                /* Compute Difference Frame */
                block0[2*col] = (float) (pixel - previousFrame[row][col]);
                block0[2*col+1] = 0.0;

                /* Save current frame for next iteration */
                previousFrame[row][col] = pixel;
            }

            cfftc(block0, FFTSIZE, LOGFFTSIZE);

            fPtr1 = currentDifference + row*FFTSIZEx2;
            for (index1 = 0; index1 < FFTSIZEx2; index1++)
                fPtr1[index1] = block0[index1];
        }

        for (col = 0; col < FFTSIZE; col++) {
            index3 = 2*col;
            for (index2 = 0; index2 < FFTSIZEx2; index2 = index2+2) {
                block0[index2] = currentDifference[index3];
                block0[index2+1] = currentDifference[index3+1];
                index3 += FFTSIZEx2;
            }

            /* Complete column FFT of difference frame */
            cfftc(block0, FFTSIZE, LOGFFTSIZE);

            index3 = 2*col;
            for (index2 = 0; index2 < FFTSIZEx2; index2 = index2+2) {
                /* Save FFT of difference frame */
                currentDifference[index3] = block0[index2];
                currentDifference[index3+1] = block0[index2+1];

                block0[index2] = currentDifference[index3]
                                 * previousDifference[index1]
                               + currentDifference[index3+1]
                                 * previousDifference[index1+1];

                block0[index2+1] = currentDifference[index3]
                                   * previousDifference[index1+1]
                                 - currentDifference[index3+1]
                                   * previousDifference[index1];

                index3 += FFTSIZEx2;
            }

            /* Compute inverse FFT for correlation matrix column */
            cfftc(block0, FFTSIZE, LOGFFTSIZE);

            /* Save to correlation frame */
            fPtr1 = &correlationMatrix[0][0];
            index3 = 2*col;
            for (index2 = 0; index2 < FFTSIZEx2; index2 = index2+2) {
                fPtr1[index3] = block0[index2];
                fPtr1[index3+1] = block0[index2+1];
                index3 += FFTSIZEx2;
            }
        }

        correlationPeak = -100000000;

        for (row = 0; row < FFTSIZE; row++) {
            fPtr1 = &correlationMatrix[row][0];
            for (index1 = 0; index1 < FFTSIZEx2; index1++)
                block0[index1] = fPtr1[index1];

            /* Inverse FFT on correlation matrix row */
            cfftc(block0, FFTSIZE, LOGFFTSIZE);

            fPtr1 = &correlationMatrix[row][0];
            for (col = 0; col < FFTSIZE; col++) {
                index1 = 2*col;
                if (correlationPeak < block0[index1]) {
                    correlationPeak = block0[index1];
                    peakRow = row;
                    peakCol = index1;
                }
                fPtr1[index1] = block0[index1];
            }
        }


        /* ------------------------------------------------------------ */
        /* Send protocol, lAverage and lFrameCount to the PC. */
        /* ------------------------------------------------------------ */

        pulPacket[0] = APPLICATION_RUNNING;
        pulPacket[1] = peakCol;
        pulPacket[2] = peakRow;
        pulPacket[3] = (long) (correlationPeak*.0001);

        DDK_PKTSend (P_PACKET_USER_INTERFACE,
                     pulPacket,
                     4 * sizeof (LONG), P_WAITFORCOMPLETE,
                     P_PC_INTERRUPT);
    } /* End if */

    DDK_PKTInterfaceStatus (P_PACKET_USER_INTERFACE, &lFifoStatus,
                            &lOutputFifoStatus);
    if (lFifoStatus != P_EMPTY) {
        DDK_PKTRecv (P_PACKET_USER_INTERFACE,
                     pulPacket,
                     4 * sizeof (LONG), P_WAITFORCOMPLETE,
                     P_NO_PC_INTERRUPT);
        ulPCData = pulPacket[0];
    } /* End if. */
} /* End while. */

/* ------------------------------------------------------------ */
/* Disable the IIOF0 interrupt. */
/* ------------------------------------------------------------ */

reset_iif_flag (IIOF0_EIIOF);
DDF_ISRDisableIIOF0 ();

/* ------------------------------------------------------------ */
/* Abort the continuous grab. */
/* ------------------------------------------------------------ */

DBF_AbortGrab ();
DBF_QuickGrabStatus (P_GRAB_WAIT);

/* ------------------------------------------------------------ */
/* Send back a protocol word indicating the */
/* last packet of data. (Pad to correct size) */
/* ------------------------------------------------------------ */

ulValue = APPLICATION_TERMINATED;
DDK_PKTSend (P_PACKET_USER_INTERFACE, &ulValue, 1L,
             P_WAITFORCOMPLETE, P_NO_PC_INTERRUPT);
DDK_PKTSend (P_PACKET_USER_INTERFACE, &lErrorStatus, 1L,
             P_WAITFORCOMPLETE, P_NO_PC_INTERRUPT);
DDK_PKTSend (P_PACKET_USER_INTERFACE, &lErrorStatus, 1L,
             P_WAITFORCOMPLETE, P_PC_INTERRUPT);

/* ------------------------------------------------------------ */
/* Empty USER input FIFO and output FIFO. */
/* (Host won't get the END message until */
/* the output FIFO is empty!) */
/* ------------------------------------------------------------ */

DDK_PKTInterfaceStatus (P_PACKET_USER_INTERFACE, &lFifoStatus,
                        &lOutputFifoStatus);

while ((lFifoStatus != P_EMPTY) || (lOutputFifoStatus != P_EMPTY)) {
    if (lFifoStatus != P_EMPTY) {
        DDK_PKTRecv (P_PACKET_USER_INTERFACE,
                     &ulPCData, 1L,
                     P_WAITFORCOMPLETE, P_NO_PC_INTERRUPT);
    } /* End if. */

    DDK_PKTInterfaceStatus (P_PACKET_USER_INTERFACE, &lFifoStatus,
                            &lOutputFifoStatus);
} /* End while. */
return P_SUCCESS;
} /* End of the DBU_UserFunction function. */
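
The DSP routine above estimates frame-to-frame motion by correlating successive difference frames in the frequency domain and then searching the correlation matrix for its peak. As a hedged illustration of that idea only, the sketch below computes a circular cross-correlation by direct summation instead of calling the `cfftc` FFT routine, and then reports the lag with the largest value. The names `CorrPeak` and `find_corr_peak` are hypothetical and do not appear in the listing. The direct sum is a substitute technique chosen so the example is self-contained, and the sign convention of the reported lag may differ from that of the listing.

```c
/* Result of the peak search: lag in pixels and the correlation value there. */
typedef struct {
    int row;
    int col;
    double peak;
} CorrPeak;

/* Circular cross-correlation of two n x n frames stored row-major,
   evaluated by direct summation. Returns the (row, col) lag at which
   cur, shifted back by that lag, best matches prev. */
CorrPeak find_corr_peak(const float *prev, const float *cur, int n)
{
    CorrPeak best = { 0, 0, 0.0 };
    int first = 1;

    for (int dr = 0; dr < n; dr++) {
        for (int dc = 0; dc < n; dc++) {
            double sum = 0.0;
            for (int r = 0; r < n; r++) {
                for (int c = 0; c < n; c++) {
                    int rr = (r + dr) % n;   /* wrap rows */
                    int cc = (c + dc) % n;   /* wrap columns */
                    sum += (double) prev[r * n + c]
                         * (double) cur[rr * n + cc];
                }
            }
            if (first || sum > best.peak) {
                best.row = dr;
                best.col = dc;
                best.peak = sum;
                first = 0;
            }
        }
    }
    return best;
}
```

The direct sum costs on the order of n^4 operations, which is why a real-time implementation on 64 x 64 frames would favor the FFT route that the listing takes.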



/*
 * Copyright (C) 1998
 * Video Mouse Group Partnership
 */

// videomouseDlg.cpp : implementation file
//

#include "stdafx.h"
#include "videomouse.h"
#include "videomouseDlg.h"

#include "dpdefs.h"   /* XPG definitions and prototypes. */
#include "dpptypes.h" /* XPG function prototypes. */
#include "dberrors.h" /* XPG error codes. */
#include "protocol.h" /* Protocol constants defined for this application */
#include <conio.h>    /* getch */
#include <math.h>

static int s_runDSPLoopThreadProc;

#ifdef _DEBUG
#define new DEBUG_NEW
#undef THIS_FILE
static char THIS_FILE[] = __FILE__;
#endif

#define P_ID_USER_FUNCTION1 10L
#define P_PCOUNT_USER_FUNCTION1 2
#define FRAMESIZE 64

UINT DSPLoopProc(LPVOID pclass)
{
    CVideomouseDlg& pcdd = *(reinterpret_cast<CVideomouseDlg*>(pclass));
    CString dataString;

    int savedCommandMode = DPK_XCCSetCommandType (P_USER);
    DPK_XCCSetWaitMode (P_NO_WAIT);

    DPK_XCCPushOpcode (P_ID_USER_FUNCTION1,
                       P_PCOUNT_USER_FUNCTION1);
    DPK_XCCPushLong ((unsigned long) pcdd.m_inputImageNumber2); /* 2 */
    DPK_XCCPushLong ((unsigned long) pcdd.m_inputImageNumber1); /* 1 */

    DPK_XCCSetCommandType (savedCommandMode);

    long status = DPK_XCCCheckStatus (P_PCOUT, P_XCCFIX);

    double framesPerSecond;
    time_t start, finish;
    time( &start );
    int counter = 0;

    if (status == P_SUCCESS) {
        while (s_runDSPLoopThreadProc) {
            counter++;

            status = DPK_PKTRecv (P_PACKET_USER_INTERFACE, (H_VOID *)
                                  pcdd.m_DSPPacket, 4 * sizeof (LONG),
                                  P_WAIT_COMPLETE);

            long protocol = pcdd.m_DSPPacket[0];
            dataString.Format("%d", pcdd.m_DSPPacket[1]);
            pcdd.m_average.SetWindowText(dataString);
            dataString.Format("%d", pcdd.m_DSPPacket[2]);
            pcdd.m_frameNumber.SetWindowText(dataString);
            time( &finish );

            double elapsedTime = difftime( finish, start );
            framesPerSecond = (double) counter / (double) elapsedTime;
            long max = pcdd.m_DSPPacket[3];

            if (max > 0.0) {
                double detectx = (double) pcdd.m_DSPPacket[2];
                double detecty = 0.5 * (double) pcdd.m_DSPPacket[1];

                if (detectx > 31) detectx = detectx - FRAMESIZE;
                if (detecty > 31) detecty = FRAMESIZE - detecty;
                else detecty = -detecty;

                // double multiplier;
                // if (detectx<2) multiplier=1.0;
                // else if (detectx<10) multiplier = exp((detectx-2.0)/2.5);
                // else multiplier=0;
                // detectx*=multiplier;
                // if (detecty<2) multiplier=1.0;
                // else if (detecty<10) multiplier = exp((detecty-2.0)/2.5);
                // else multiplier=0;
                // detecty*=multiplier;

                static POINT ptCursor;
                GetCursorPos(&ptCursor);

                ptCursor.x -= (long) detectx;
                ptCursor.y -= (long) detecty;

                SetCursorPos(ptCursor.x, ptCursor.y);
            }
        }

        savedCommandMode = DPK_XCCSetCommandType (savedCommandMode);
        DPK_XCCSetWaitMode (P_WAIT_COMPLETE);
        DPK_EndPCK ();
        AfxMessageBox("Exited Thread");
    }
    else {
        s_runDSPLoopThreadProc = false;
        savedCommandMode = DPK_XCCSetCommandType (savedCommandMode);
        DPK_XCCSetWaitMode (P_WAIT_COMPLETE);
        DPK_EndPCK ();
        AfxMessageBox("Exited Thread");
    }

    time( &finish );
    double elapsedTime = difftime( finish, start );
    framesPerSecond = (double) counter / (double) elapsedTime;
    return 1;
}
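
The host loop above treats a peak index greater than 31 in a 64-point correlation as a negative displacement, because a circular correlation reports negative lags in the upper half of its index range. A minimal sketch of that mapping follows, written in C under the assumption of an even frame size; the function name `unwrap_shift` is hypothetical and is not part of the listing.

```c
/* Map a circular correlation index in [0, n) to a signed shift in
   [-n/2, n/2). Indices in the upper half of the range stand for
   negative shifts that wrapped around. Assumes n is even. */
int unwrap_shift(int index, int n)
{
    return (index >= n / 2) ? index - n : index;
}
```

With `n` equal to 64 this reproduces the `detectx` adjustment in the listing; the listing additionally negates the vertical component before moving the cursor.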

/////////////////////////////////////////////////////////////////////////////
// CAboutDlg dialog used for App About

class CAboutDlg : public CDialog
{
public:
    CAboutDlg();

// Dialog Data
    //{{AFX_DATA(CAboutDlg)
    enum { IDD = IDD_ABOUTBOX };
    //}}AFX_DATA

    // ClassWizard generated virtual function overrides
    //{{AFX_VIRTUAL(CAboutDlg)
    protected:
    virtual void DoDataExchange(CDataExchange* pDX); // DDX/DDV support
    //}}AFX_VIRTUAL

// Implementation
protected:
    //{{AFX_MSG(CAboutDlg)
    //}}AFX_MSG
    DECLARE_MESSAGE_MAP()
};

CAboutDlg::CAboutDlg() : CDialog(CAboutDlg::IDD)
{
    //{{AFX_DATA_INIT(CAboutDlg)
    //}}AFX_DATA_INIT
}

void CAboutDlg::DoDataExchange(CDataExchange* pDX)
{
    CDialog::DoDataExchange(pDX);
    //{{AFX_DATA_MAP(CAboutDlg)
    //}}AFX_DATA_MAP
}

BEGIN_MESSAGE_MAP(CAboutDlg, CDialog)
    //{{AFX_MSG_MAP(CAboutDlg)
    // No message handlers
    //}}AFX_MSG_MAP
END_MESSAGE_MAP()

/////////////////////////////////////////////////////////////////////////////
// CVideomouseDlg dialog

CVideomouseDlg::CVideomouseDlg(CWnd* pParent /*=NULL*/)
    : CDialog(CVideomouseDlg::IDD, pParent)
{
    //{{AFX_DATA_INIT(CVideomouseDlg)
    //}}AFX_DATA_INIT
    // Note that LoadIcon does not require a subsequent DestroyIcon in Win32
    m_hIcon = AfxGetApp()->LoadIcon(IDR_MAINFRAME);
}

void CVideomouseDlg::DoDataExchange(CDataExchange* pDX)
{
    CDialog::DoDataExchange(pDX);
    //{{AFX_DATA_MAP(CVideomouseDlg)
    DDX_Control(pDX, IDC_FRAMENUMBER, m_frameNumber);
    DDX_Control(pDX, IDC_AVERAGE, m_average);
    //}}AFX_DATA_MAP
}

BEGIN_MESSAGE_MAP(CVideomouseDlg, CDialog)
    //{{AFX_MSG_MAP(CVideomouseDlg)
    ON_WM_SYSCOMMAND()
    ON_WM_PAINT()
    ON_WM_QUERYDRAGICON()
    ON_BN_CLICKED(IDC_ENABLE, OnEnable)
    ON_BN_CLICKED(IDC_STOP, OnStop)
    //}}AFX_MSG_MAP
END_MESSAGE_MAP()

/////////////////////////////////////////////////////////////////////////////
// CVideomouseDlg message handlers

BOOL CVideomouseDlg::OnInitDialog()
{
    CDialog::OnInitDialog();

    // Add "About..." menu item to system menu.

    // IDM_ABOUTBOX must be in the system command range.
    ASSERT((IDM_ABOUTBOX & 0xFFF0) == IDM_ABOUTBOX);
    ASSERT(IDM_ABOUTBOX < 0xF000);

    CMenu* pSysMenu = GetSystemMenu(FALSE);
    if (pSysMenu != NULL)
    {
        CString strAboutMenu;
        strAboutMenu.LoadString(IDS_ABOUTBOX);
        if (!strAboutMenu.IsEmpty())
        {
            pSysMenu->AppendMenu(MF_SEPARATOR);
            pSysMenu->AppendMenu(MF_STRING, IDM_ABOUTBOX, strAboutMenu);
        }
    }

    // Set the icon for this dialog. The framework does this automatically
    // when the application's main window is not a dialog
    SetIcon(m_hIcon, TRUE);  // Set big icon
    SetIcon(m_hIcon, FALSE); // Set small icon

    InitializeFrameGrabber();

    return TRUE; // return TRUE unless you set the focus to a control
}

void CVideomouseDlg::OnSysCommand(UINT nID, LPARAM lParam)
{
    if ((nID & 0xFFF0) == IDM_ABOUTBOX)
    {
        CAboutDlg dlgAbout;
        dlgAbout.DoModal();
    }
    else
    {
        CDialog::OnSysCommand(nID, lParam);
    }
}

// If you add a minimize button to your dialog, you will need the code below
// to draw the icon. For MFC applications using the document/view model,
// this is automatically done for you by the framework.

void CVideomouseDlg::OnPaint()
{
    if (IsIconic())
    {
        CPaintDC dc(this); // device context for painting

        SendMessage(WM_ICONERASEBKGND, (WPARAM) dc.GetSafeHdc(), 0);

        // Center icon in client rectangle
        int cxIcon = GetSystemMetrics(SM_CXICON);
        int cyIcon = GetSystemMetrics(SM_CYICON);
        CRect rect;
        GetClientRect(&rect);
        int x = (rect.Width() - cxIcon + 1) / 2;
        int y = (rect.Height() - cyIcon + 1) / 2;

        // Draw the icon
        dc.DrawIcon(x, y, m_hIcon);
    }
    else
    {
        CDialog::OnPaint();
    }
}

// The system calls this to obtain the cursor to display while the user drags
// the minimized window.

HCURSOR CVideomouseDlg::OnQueryDragIcon()
{
    return (HCURSOR) m_hIcon;
}

void CVideomouseDlg::OnEnable()
{
    s_runDSPLoopThreadProc = true;
    m_pDSPLoopThread = AfxBeginThread (DSPLoopProc, this);
}

void CVideomouseDlg::OnStop()
{
    if (s_runDSPLoopThreadProc) {
        m_DSPPacket[0] = END_APPLICATION_REQUEST;
        DPK_PKTSend (P_PACKET_USER_INTERFACE, m_DSPPacket, 4 * sizeof (LONG),
                     P_WAIT_COMPLETE);
        s_runDSPLoopThreadProc = false;
    }
}

#define INIT_FAILURE 1

long CVideomouseDlg::InitializeFrameGrabber()
{
    DPK_InitPCK(1);

    if ((m_status = DPK_InitXPG (0, P_IFB_RELOAD_COFF_FILE |
         P_IFB_CHECK_REVISION, "videoMouse.out")) != P_SUCCESS) {
        DPK_EndPCK ();
        m_errorMessage.Format("Error initializing FPG, status = %ld.\n", m_status);
        AfxMessageBox(m_errorMessage);
        return INIT_FAILURE;
    }

    DPK_XCCSetWaitMode (P_WAIT_COMPLETE);
    long cpsNumber = DPF_LoadCPF("vidmouse.cpf");

    if (cpsNumber != P_SUCCESS) {
        m_errorMessage.Format("Error loading a CPF, status = %ld.\n", m_status);
        AfxMessageBox(m_errorMessage);
        DPK_EndPCK ();
        return INIT_FAILURE;
    }

    if ((m_status = DBF_SelectCPS (cpsNumber)) < P_SUCCESS) {
        m_errorMessage.Format("Error selecting a CPS. status = %ld.\n", m_status);
        AfxMessageBox(m_errorMessage);
        DPK_EndPCK ();
        return INIT_FAILURE;
    }

    m_status = DBF_SetGrabWindow(P_DEFAULT_QGS, 256, 128, 176, 128);

    if ((m_status = DBF_GetGrabWindow (P_DEFAULT_QGS, &m_startCol,
         &m_numCols,
         &m_startRow, &m_numRows)) != P_SUCCESS) {
        m_errorMessage.Format("Error DBF_GetGrabWindow: %ld.\n", m_status);
        AfxMessageBox(m_errorMessage);
        DPK_EndPCK ();
        return INIT_FAILURE;
    }

    if ((m_status = DBK_MmtCreateImage (m_numCols, m_numRows,
         P_DATA_SIZE_BYTE,
         P_DATA_TYPE_INTEGER, 2, &m_inputImageNumber1,
         &m_numberImagesCreated))
        != P_SUCCESS) {
        m_errorMessage.Format("Error creating the input image; status = %ld.\n",
                              m_status);
        AfxMessageBox(m_errorMessage);
        DPK_EndPCK ();
        return INIT_FAILURE;
    }

    m_inputImageNumber2 = m_inputImageNumber1 + 1;
    return 0;
}



It should be understood that the following claims are to cover all generic and
specific features of the invention described herein, and all statements of the scope of
the invention which, as a matter of language, might be said to fall therebetween.



Having described the invention, what is claimed is:

1. A human motion following controller for augmenting motion of items shown
on a computer display, the display being coupled to a computer of the type which
controls positioning of the items through operating system controls, comprising:

a camera for capturing frames of data corresponding to a first image of at least part
of a user at the computer display;

signal processing means coupled to the camera for (a) detecting differences between
successive frames of data corresponding to motion of the first image, and (b)
communicating differences information to the computer to reposition display of the
items through operating system controls, the items being repositioned on the display
by an amount corresponding to the motion of first image.

2. A controller of claim 1, wherein the items comprise a computer cursor.

3. A controller of claim 1, wherein the items comprise a scene view.

4. A controller of claim 1, further comprising a PC card for installation within
the computer and for communication on a computer bus, the signal processing
means being substantially resident with the PC card for communicating differences
information to the bus.

5. A controller of claim 1, wherein the camera comprises means for capturing
augmented frames of data corresponding to a second image of part of the user at the
computer display, the signal processing means further comprising means for
detecting differences between successive augmented frames of data corresponding
to motion of the second image and for communicating augmented difference
information to the computer to reposition display of the items through operating
system controls, the items being repositioned on the display by an amount
corresponding to motion of the first and second images.

6. A controller of claim 1, further comprising frame difference electronics for
storing and subsequently subtracting pixel-by-pixel difference data.

7. A controller of claim 6, wherein the difference electronics comprise multiple
frame memory, a subtraction circuit, and a state machine controller/memory
addresser to control data flow.

8. A controller of claim 1, further comprising N frame video memory for storing
frames of image data.

9. A controller of claim 1, further comprising a DSP for implementing select
algorithms on difference frames or raw frames of image data.

10. A controller of claim 9, further comprising memory selected from the group
of EPROM and RAM.

11. A controller of claim 9, further comprising means for interfacing the DSP to a
PCI bus in the computer.

12. A controller of claim 1, further comprising MPEG compression electronics for
compressing video for the computer.

13. A controller of claim 1, wherein the signal processing means comprises frame
differencing means for removing unchanged information from image frames.

14. A controller of claim 1, wherein the signal processing means comprises frame
memory to buffer one or more image frames.

15. A controller of claim 14, wherein the signal processing means comprises a
frame differencer for reading a delayed frame from the frame memory and for
subtracting the delayed frame from a current image frame.

16. A controller of claim 1, wherein the signal processing means comprises
correlation means for determining row and column shifts corresponding to
differences between a current image frame and a delayed image frame.

17. A controller of claim 16, wherein the signal processing means comprises best
fit algorithm means for minimizing the shifts to provide alignment.

18. A controller of claim 17, wherein the best fit algorithm means utilizes a peak
detect algorithm.

19. A controller of claim 1, wherein the signal processing means comprises video
cursor control for enabling and alternatively disabling cursor control.

20. A controller of claim 19, wherein the video cursor control comprises means
responsive to keystrokes at the computer.

21. A controller of claim 19, wherein the video cursor control comprises means
responsive to a blink of an eye of the user.

22. A controller of claim 19, wherein the video cursor control comprises means
responsive to sound generated by the user.

23. A controller of claim 22, further comprising a microphone to detect the sound.

24. A controller of claim 1, wherein the signal processing means comprises a
complex multiplier for providing a two dimensional inverse FFT operation.

25. A controller of claim 24, wherein the signal processing means comprises a
peak detect for determining a shift associated with aligning difference images.

26. A controller of claim 1, wherein the signal processing means comprises FFT
means for providing a two dimensional FFT of image data.

27. A controller of claim 1, wherein the signal processing means comprises means
for identifying parts of the user, the parts being selected from the group of a hand,
elbow, head, neck, ears, and forehead.

28. A controller of claim 1, wherein the signal processing means comprises means
for detecting left and right movement of a head of the user and for shifting the items
in response to the left and right movement.

29. A controller of claim 1, wherein the signal processing means comprises means
for detecting rotational movement of a head of the user and for rotating the items in
response to the left and right movement.

29. A controller of claim 1, wherein the signal processing means comprises means
for repositioning the items, if appropriate, at approximately every 1/30th of a second.

30. A controller of claim 1, wherein the signal processing means comprises means
for repositioning the items at a selected magnification as compared to actual
movement of the user.

31. A controller of claim 1, wherein the signal processing means comprises means
for storing image data of a head of the user at various orientations relative to the
camera and for correlating image data to the stored image data to define head
orientation, the head orientation being used to reposition the items.

32. A controller of claim 1, wherein the signal processing means comprises IR
means for detecting heat associated with the user and for repositioning the items at a
rate correlated to the heat.

33. A controller of claim 32, wherein the items correspond to computer gaming
display images.

34. A controller of claim 32, wherein the heat corresponds to user stress.

35. A controller of claim 1, further comprising at least one other camera arranged
to take images of at least a second part of the user.

36. A controller of claim 35, wherein the one other camera takes image data in a
second electromagnetic spectrum.

37. A controller of claim 1, wherein the camera comprises a DSP.

38. A controller of claim 37, wherein the DSP processes difference information for
the computer.

39. A controller of claim 1, wherein the signal processing means comprises a CPU
within the computer.

40. A controller of claim 1, wherein the signal processing means comprises means
for processing multiple image zones in frames of image data and for repositioning
the items according to characteristics between zones.

41. A controller of claim 40, wherein one zone comprises image data
corresponding to at least one eye of the user.

42. A controller of claim 41, wherein the signal processing means comprises
means for determining a blink of the eye.

43. A controller of claim 42, wherein the signal processing means comprises
means for disabling and alternatively enabling cursor control based upon the blink.

44. A controller of claim 1, wherein the signal processing means comprises means
for processing image data to determine motion of at least one eye of the user and for
repositioning the items based upon the motion.

45. A controller of claim 1, wherein the signal processing means comprises means
for processing image data to determine motion of a pupil of at least one eye of the
user and for repositioning the items based upon the motion.

46. A controller of claim 1, wherein the camera comprises a zoom attachment for
automatically zooming into a desired magnification of at least one eye of the user.

47. A controller of claim 1, wherein the camera comprises zoom means for
automatically focusing on the user as the user moves in distance from the camera.

48. A controller of claim 47, wherein the signal processing means comprises
means for enlarging or shrinking the items on the display in response to focusing by
the zoom means.

49. A controller of claim 1, wherein the signal processing means comprises means
for determining edges of a head of the user and for repositioning the items in
response to movements of the edges.

50. A controller of claim 1, wherein the signal processing means comprises means
for isolating one or more objects held by the user and for repositioning the items in
response to movement of the objects.

51. A controller of claim 1, wherein the signal processing means comprises means
for isolating one or more parts of the user and for repositioning the items in response
to movement of the parts.

52. A controller of claim 1, wherein the parts comprise at least one of a hand,
head, and a foot.

53. A controller of claim 1, wherein the signal processing means comprises means
for isolating one or more symbols associated with the user and for repositioning the
items in response to movement of the symbols.

54. A controller of claim 1, further comprising a second camera constructed and
arranged for viewing the user from above, the signal processing means having
means for repositioning the items in response to movement detected from images in
the second camera.

55. A controller of claim 54, wherein signal processing means comprises means
for repositioning the items in response to forward and backward movement of the
user as detected by the second camera.

56. A controller of claim 1, further comprising re-calibration means connected
with the signal processing means for repositioning the items to an original position
in response to a re-calibration event.

57. A controller of claim 1, wherein the re-calibration means comprises a
microphone and the event comprises a sound generated by the user.

58. A controller of claim 1, wherein the camera comprises the re-calibration
means for detecting a blink of the user.

59. A system for controlling a computer, comprising:

a transducer for converting optical signals to electrical signals;

signal processor means for detecting motion in the digital data and providing a
digital representation of said motion;

communication means for entering one or more of the electronic signals, digital data,
and digital representation into the computer to manipulate a computer display in
response to the motion.

60. A system of claim 59, wherein the computer comprises the signal processor
means.

61. A system of claim 59, wherein said signal processor comprises a digital signal
processor separate from a CPU within the computer.

62. A system of claim 59, wherein the transducer, electronic means and signal
processor are constructed and arranged into a single device in communication with
the computer.

63. A system of claim 59, wherein the communication means comprises one of
RS170 video, a PCI bus interface, a digital computer interface, a serial computer
interface.

64. A system of claim 59, further comprising means for repositioning a computer
cursor in response to the motion.

65. A system of claim 59, wherein the transducer comprises one or more of a
visible CCD camera and an IR camera.

WO 99/35633 PCT/US99/00086

66. A system of claim 59, wherein the transducer comprises a CCD camera
having at least 2x2 imaging pixels.

67. A system of claim 66, wherein the camera comprises optics with various fields
of view.

68. A system of claim 59, wherein the transducer comprises one of a CCD or a
CMOS integrated circuit with digital outputs.

69. A system of claim 68, wherein the transducer generates RS170 output.

70. A system of claim 68, wherein the transducer generates RS170 digital output.

71. A system of claim 68, wherein the transducer generates digital resolutions of 4
bits or greater.

72. A system of claim 59, wherein the signal processor comprises a video frame
memory.

73. A system of claim 59, wherein the signal processor comprises frame
difference functionality.

74. A system of claim 59, wherein the signal processor comprises video frame
difference memory.

75. A system of claim 59, wherein the signal processor comprises correlation
functionality.

76. A system of claim 59, wherein the signal processor comprises means for
determining best fit motion.

77. A system of claim 59, further comprising means for controlling cursor
movement.

78. A system of claim 59, further comprising means for segmenting video images
to provide multiple digital representations of the motion corresponding to different
portions of the digital representation.

79. A system of claim 78, wherein the optical signals are generated through image
acquisition of a portion of a human.

80. A system of claim 78, wherein the optical signals are generated by viewing
multiple features of a human.

81. A system of claim 59, further comprising neural net means for learning user
motion over time.

1. With regard to the elements of the international application:*

[x] the international application as originally filed

[x] the description:
    pages 1-58, as originally filed
    pages NONE, filed with the demand
    pages NONE, filed with the letter of

[x] the claims:
    pages 59-68, as originally filed
    pages NONE, as amended (together with any statement) under Article 19
    pages NONE, filed with the demand
    pages NONE, filed with the letter of

[x] the drawings:
    pages 1-20, as originally filed
    pages NONE, filed with the demand
    pages NONE, filed with the letter of

[x] the sequence listing part of the description:
    pages NONE, as originally filed
    pages NONE, filed with the demand
    pages NONE, filed with the letter of

2. With regard to the language, all the elements marked above were available or furnished to this Authority in the language in which
the international application was filed, unless otherwise indicated under this item.

These elements were available or furnished to this Authority in the following language which is:

[ ] the language of a translation furnished for the purposes of international search (under Rule 23.1(b)).
[ ] the language of publication of the international application (under Rule 48.3(b)).
[ ] the language of the translation furnished for the purposes of international preliminary examination (under Rules 55.2 and/
or 55.3).

3. With regard to any nucleotide and/or amino acid sequence disclosed in the international application, the international
preliminary examination was carried out on the basis of the sequence listing:

[ ] contained in the international application in printed form.
[ ] filed together with the international application in computer readable form.
[ ] furnished subsequently to this Authority in written form.
[ ] furnished subsequently to this Authority in computer readable form.
[ ] The statement that the subsequently furnished written sequence listing does not go beyond the disclosure in the
international application as filed has been furnished.
[ ] The statement that the information recorded in computer readable form is identical to the written sequence listing has
been furnished.

4. [x] The amendments have resulted in the cancellation of:

    the description, pages NONE
    the claims, Nos. NONE
[x] the drawings, sheets/fig NONE

5. Q This report has Uen drawn as if (some of) the amendments had not been made, since they have been considered to go 

beyond the disclosure as filed, as indicated in the Supplemental Box (Rule 70.2(c)).* ♦ 
* Replacement sheets which have been furnished to the receiving Office in response to an invitation under Article 14 are referred to
in this report as "originally filed" and are not annexed to this report since they do not contain amendments (Rules 70.16

V. Reasoned statement under Article 35(2) with regard to novelty, inventive step or industrial applicability;
citations and explanations supporting such statement

1. Statement

    Novelty (N)                      Claims (Please see supplemental sheet)    YES
                                     Claims (Please see supplemental sheet)    NO

    Inventive Step (IS)              Claims (Please see supplemental sheet)    YES
                                     Claims (Please see supplemental sheet)    NO

    Industrial Applicability (IA)    Claims (Please see supplemental sheet)    YES
                                     Claims (Please see supplemental sheet)    NO

2. Citations and explanations (Rule 70.7)

Claims 1-5, 9-11, 14, 30, 35, 37-39, 50-51, 53-55, 59-62, 64, 68, 72, 77-81 lack novelty under PCT Article 33(2)
as being anticipated by Platzker et al. (5,528,263).

In regards to claim 1, Platzker shows a human motion following controller for augmenting motion of items shown on
a computer display (figure 1, items 14 and 24), the display being coupled to a computer of the type which controls positioning
of items through operating system controls (fig. 1, item 12), comprising: a camera for capturing frames of data corresponding
to a first image of at least part of a user at the computer display (fig. 1, item 28a), signal processing means coupled to the camera
(fig. 1, item 12b) for (a) detecting differences between successive frames of data corresponding to motion of the first image,
and (b) communicating differences information to the computer to reposition display of the items through operating system
controls, the items being repositioned on the display by an amount corresponding to the motion of first image (column 5, lines
10-32). In regards to claims 2, 64 and 77, Platzker shows wherein the items comprise a computer cursor in response to motion
(abstract and figure 5a). In regards to claim 3, Platzker shows wherein the items comprise a scene view (fig. 1, item 24). In
regards to claim 4, Platzker does not show the details of a PC card and a computer bus, but examiner contends such features are
inherent to a computer system as shown in figure 1. In regards to claim 5, Platzker shows the camera comprises means for
capturing augmented frames of data corresponding to a second image of a part of the user at the computer display, the signal
processing means further comprising means for detecting differences between successive augmented frames of data corresponding
to motion of the second image and for communicating augmented difference information to the computer to reposition display
of the items through operating system controls, the items being repositioned on the display by an amount corresponding to motion
of the first and second images (figures 1 and 3a, b, c and column 5, lines 10-34). In regards to claim 9, Platzker shows a DSP for
implementing select algorithms on difference frames or raw frames of image data (abstract and figures 3a and 6a). In regards
to claim 10, Platzker (Continued on Supplemental Sheet.)

V. 1. REASONED STATEMENTS:

The report as to Novelty was positive (YES) with respect to claims 6-8, 12-13, 15-29, 31-34, 36, 40-49, 52, 56-58, 63, 65

The report as to Novelty was negative (NO) with respect to claims 1-5, 9-11, 14, 30, 35, 37-39, 50-51, 53-55, 59-62, 64, 68,
72, 77-81.

The report as to Inventive Step was positive (YES) with respect to claims 12, 22-23, 32-34, 36, 57, 65-67.

The report as to Inventive Step was negative (NO) with respect to claims 1-11, 13-21, 24-31, 35, 37-56, 58-64, 68-81.

The report as to Industrial Applicability was positive (YES) with respect to claims 1-81.

The report as to Industrial Applicability was negative (NO) with respect to claims NONE.

V. 2. REASONED STATEMENTS - CITATIONS AND EXPLANATIONS (Continued):

does not directly show his controller using specific memory types such as EPROM and RAM, but examiner considers such
common memory types to be inherent to the normal operation of a personal computer as shown in figure 1 of Platzker. In
regards to claim 11, it is inherent that the DSP of Platzker is interfaced to the PCI bus in the computer shown in figure 1. In
regards to claim 14, Platzker teaches frame memory to buffer one or more image frames (abstract). In regards to claim 30,
Platzker shows magnification in a projected image (figure 1). In regards to claims 35, 54 and 55, Platzker shows a second camera
(figure 1). In regards to claim 39, Platzker shows a CPU (figure 1). In regards to claims 37 and 38, Platzker does not directly
state he has a DSP in his camera, but he does state he has a CCD and an appropriate signal processing apparatus to the CCD;
therefore examiner contends a DSP is inherent to the camera. In regards to claims 50, 51 and 53, Platzker shows isolating one
or more objects or symbols held by the user (figure 3c). In regards to claim 59, Platzker does not directly claim his video camera
has a transducer for converting optical signals to electrical signals, but examiner contends this feature is inherent in an
electronic video camera which uses a CCD (column 5, lines 45-57). In regards to claims 60-61, the computer comprises a
digital signal processor separate from a CPU within the computer (column 5, lines 10-22). In regards to claim 62, Platzker
teaches wherein the transducer, electronic means and signal processor are constructed and arranged into a single device in
communication with the computer (figure 1, items 28a and 12b). In regards to claim 68, Platzker shows the transducer
comprises one of a CCD or a CMOS integrated circuit with digital outputs (column 5, lines 45-57). In regards to claim 72,
Platzker teaches wherein the signal processor comprises a video frame memory (column 5, lines 31-32). In regards to claims
78-81, Platzker teaches segmenting video images (figures 3b, 3c and 4, column 7, lines 22-29), image acquisition of a portion of a
human hand and multiple features of a human (figures 1, 3b, 3c and 4) and a neural net means for learning user motion over
time (figure 4, column 6, lines 40-49).

Claims 6-8, 13, 15-21, 24-29, 31, 40-49, 52, 56, 58, 63, 69-71, 73-76 lack an inventive step under PCT Article
33(3) as being obvious over Platzker et al. (5,528,263).

In regards to claims 6-8, 13 and 15, Platzker does not directly show that the difference electronics comprise multiple frame
memory, a subtraction circuit, and a state machine controller/memory addresser to control data flow, for storing and
subsequently subtracting pixel-by-pixel difference data, but Platzker does state that he performs these functions and therefore it
would have been obvious that he has difference electronics as broadly claimed which perform these functions (abstract and
figures 6a and 6b, column 5, lines 10-34).

In regards to claims 16-18, 24-26 and 73-76, Platzker suggests a signal processor with correlation means, best fit
algorithm means and a peak detect algorithm, difference functionality and a two dimensional inverse FFT operation because
these are common, well known mathematical ways of detecting patterns and figure 3a, item 46 suggests this (column 2, lines
[illegible]-53 and column 6, lines 34-49).

In regards to claims 19 and 20, Platzker does not directly show video cursor control being disabled, but examiner
contends that Platzker just illustrates a program and hardware connected to a PC and it would be inherent that one has the
ability to turn the program off and go back to conventional mouse control and other keystrokes at the computer.

In regards to claims 21, 27-29, 31, 40-46, 49 and 52, Platzker does not directly show a controller responsive to a blink of
an eye or movement of the head of the user, but he does show a controller responsive to hand movements, and examiner contends
that the Platzker apparatus has the capability to also be responsive to a blink of an eye and movement of the head because the
principle of operation of Platzker does not change, i.e. detection of a body part movement such as a hand. This modification would
give the user more ways to implement more control options or functions, and therefore the modification of the Platzker
apparatus to also detect movement of the head or eye would have been obvious.

In regards to claims 47 and 48, Platzker does not directly state that his camera has zoom and automatic focus, but
examiner contends that these features are well known in the prior art and, since they are clearly desirable, well known common
features on cameras which can make the camera easier to use, which is motivation, it would have been
obvious to have these features on the Platzker camera. Note the disclosure does not teach how to implement zoom or
automatic focus, and therefore it was reasonable to believe applicant was in agreement that these features were
known at the time of his application; therefore it is only a question of whether there was known motivation at the time of invention
to use well known camera features such as zoom and automatic focus in the Platzker camera.

In regards to claims 56 and 58, Platzker does not directly teach a re-calibration means as broadly claimed, but
examiner contends the positive benefits of calibration are well known in the prior art and that Platzker would want to keep his
equipment in calibration to get accurate results, which is the motivation to do re-calibration as broadly claimed.

In regards to claims 63 and 69-70, Platzker does not directly state that his camera's transducer generates [illegible]
RS170 digital output for communication means to the computer interface. The examiner contends it would have been obvious
to use a well known standard camera output such as RS170 because nonstandard outputs would be more costly and require
more engineering modifications.

In regards to claim 71, Platzker does not directly state that the transducer generates digital resolution of 4 bits or
greater, but examiner contends it would have been obvious that the CCD in Platzker would have had at least an 8 bit resolution,
which was common in video cameras of the time and corresponds to the very common 256-level gray scale.

Claims 12, 22-23, 32-34, 36, 57, 65-67 meet the criteria set out in PCT Article 33(2)-(4), because the prior art does
not teach or fairly suggest: in regards to claim 12, MPEG compression electronics; in regards to claims 22-23, a video cursor
control means responsive to sound generated by the user; in regards to claims 32-34, an IR means for [illegible]
the user and for repositioning the items at a rate correlated to the heat; in regards to claim 36, [illegible]
image data in a second electromagnetic spectrum; in regards to claim 65, the transducer is a [illegible]; in
regards to claims 66-67, the CCD camera has 2x2 imaging pixels; in regards to claim 57, the re-calibration means comprises a
microphone and the event comprises a sound generated by the user.

- NEW CITATIONS

US 5,528,263 (PLATZKER ET AL.) 18 June 1996, abstract, figures 1, 3a, 5a, 6a, 6b, column 1, lines 23-68

HUMAN MOTION FOLLOWING COMPUTER MOUSE AND GAME

CONTROLLER

Background 

The primary human-computer interface employed today uses both a keyboard to enter
textual information and a mouse to provide control over graphical interfaces presented to
the user. These are the most widely used interfaces for computer applications, which
include word processing, presentation software (e.g., PowerPoint), computer aided design
packages (e.g., AutoCAD), spreadsheet analysis (e.g., Excel) and others. These interfaces are
also widely used for computer gaming entertainment, though often augmented by or
replaced by a joystick.

In the daily use of business applications, the need to control the cursor position on the 
screen requires that the user remove his/her hand from the keyboard in order to use the 
standard mouse. The use of the mouse introduces several issues. In a desk environment, 
the use of the mouse requires maintenance of a free area on the desk. The mouse cord 
must also remain free from obstruction in movement. Additionally, the use of the mouse 
is a major contributing factor for carpal tunnel syndrome. It would be advantageous 
therefore to find an alternative to the mechanical mouse. 



In computer gaming, the complexity of the controls required to operate the game requires
a combination of either mouse and keyboard or joystick and keyboard. Gaming
applications require the user to control many axes of motion. These can include forward
motion, reverse motion, left turn, right turn, left strafe (slide), right strafe, upward
motion and downward motion. Additionally, many games allow the user to look in
directions different from that in which the vehicle is moving, including up, down, left and
right. These many axes of motion therefore drive the complexity and enjoyment of the
game.

It is also desirable to offer alternative approaches to human-computer interfaces to those 
incapable of using standard devices due to disability. 

The object of this invention is to provide the means to either replace or augment the
existing human computer interfaces by allowing the operator to control the cursor
position by motion of the user observed by a video camera. This motion may be imparted
by the up-down, left-right motion of the user's head, hands or other motion presented to
the video camera. In a close-up view of the user's facial features, the up-down, left-right
motion may be imparted through rotation, as the facial features observed by the camera
will appear as a translation. While systems are available to achieve this capability, they
remain prohibitively expensive to the general public. These costs are driven by the
techniques and algorithms by which these systems detect the user's head motion.
Additionally, all techniques in existence today require the user to augment the system by
wearing a detectable target or an apparatus which emits or detects a signal, all of which
are cumbersome to the user. The unique element in this invention which brings the
system cost into the range of the general public is a unique and efficient algorithm for
detecting user motion without the aid of augmentation by artificial devices placed on the
operator.

One object of this invention is to provide a means of human control of a graphical 
computer interface through the physical motion of the user in order to control the activity 
of a cursor in the manner usually accomplished with a computer mouse. 

A further object of this invention is to provide additional degrees of freedom in the
human computer interface in support of computer games and entertainment software.

A further object of this invention is the dual use of the electronics and camera used for 
human computer interface in support of teleconferencing and video frame capture. 

Summary of Invention

The invention provides the means for a human to interface to a computer via physical
motion of the user, e.g. the user's head or hands. It also provides a human factors
approach to cursor movement in which the user's rate of motion determines the relative
motion of the cursor, i.e. the faster the user's head travels over a set distance, the further the
corresponding cursor movement.

In one aspect, the invention includes an electro-optic sensor consisting of an array of 
imaging elements. This sensor may be either a visible light camera utilizing ambient 
lighting conditions or a camera sensitive in another band such as the near IR, in which the 
illumination source is from an IR lamp which is beyond human sensory perception. This 
sensor is mounted facing the user in such a way that the user's face is captured by the 
sensor. 

The images captured by the sensor are processed by a digital signal processor located
in the user's computer. The detected motion of the user is communicated to the
user's operating system via the PCI bus interface. These commands are interpreted by a
low overhead program operating on the user's main processor that either updates the
cursor position on the screen or provides motion information to the user's computer game.

Alternatively, the digital signal processor may be mounted in the camera housing such 
that the camera/signal processing subsystem produces signals which emulate the mouse 
via the mouse input connector. 

The pixel format of the camera drives the accuracy of the system. It should be noted that
the greater the density of pixels on the image of the user's face, the higher the resolution
of cursor motion which can be attained. Camera formats of 240 vertical by 320
horizontal should provide satisfactory performance. The number of pixels that may be
utilized is strictly determined by system cost factors. The greater the number of pixels,
the more powerful the DSP must be in order to process the image sequences in real time.
Current technology limits the processing density to a 64x64 window for a consumer product.
While this is satisfactory for the general user, a higher fidelity system using a greater
number of pixels is possible, at a proportionally higher cost.

The data transfer rate from the camera is 30 frames/second at 240x320 pixels per frame.
Assuming eight bits per pixel, the digital data transfer rate is therefore 18.432
megabits/second. This is a high transfer rate for a consumer product using current
technology. While the data transfer can be either analog or digital, the preferred method
of image data transfer for this invention is therefore via a standard RS 170 analog video
interface.
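
The quoted figure can be checked with a few lines of arithmetic. The sketch below is only an illustration of that calculation; the function name raw_bit_rate is hypothetical and does not appear in the disclosure.

```python
# Illustrative check of the raw video data rate quoted in the text.
FRAMES_PER_SECOND = 30
ROWS, COLS = 240, 320
BITS_PER_PIXEL = 8  # the text assumes eight bits per pixel


def raw_bit_rate(fps, rows, cols, bits_per_pixel):
    """Return the uncompressed video bit rate in bits per second."""
    return fps * rows * cols * bits_per_pixel


rate = raw_bit_rate(FRAMES_PER_SECOND, ROWS, COLS, BITS_PER_PIXEL)
print(f"{rate / 1e6:.3f} megabits/second")  # prints "18.432 megabits/second"
```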

Brief Description of the Drawings 

Figure 1 illustrates the human computer interface according to the invention
Figure 2 illustrates the functions required on the printed circuit card
Figure 3 illustrates the algorithm that forms the basis of the invention

Detailed Description of the Drawings

Figure 1 illustrates the major components of the human computer interface. The user 5
sits facing the computer display 10. A camera 15 is mounted to the monitor facing the
computer user 5. The camera 15 is mounted in such a way that the user's face is imaged.
The camera 15 is interfaced to a printed circuit card 20 mounted in the user's computer
chassis 25. The camera 15 interfaces to the printed circuit card 20 via a camera interface
cable 30.

Figure 2 illustrates the functions required on the printed circuit board 20 mounted in the
user's computer 25. A camera interface circuit 50 receives the video data. In the preferred
implementation, this data is in RS 170 format. This function decodes the analog video data
to determine the video timing signals embedded in the analog data. These timing signals
are used to control an analog to digital converter that converts the analog pixel data into
digital camera images. The analog data is digitized into 6 bits in the preferred
implementation, although any number of bits greater than six may be acceptable and/or
required for other optional features on the printed circuit card.

The frame difference electronics 55 receive the digital data from the camera interface
circuit 50. The frame difference electronics include a single frame memory, a subtraction
circuit and a state machine controller/memory addresser to control the data flow. The
frame memory holds the previous digitized frame. As each digitized pixel is received by
the frame difference electronics, the corresponding pixel from the previous frame is read
from the frame memory and subtracted from the current pixel. The resulting difference is
output to the N frame video memory. The new frame pixel data is then stored into the
frame difference electronics frame memory.
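
The data flow through the frame difference electronics can be sketched in software. This is a minimal illustration only, assuming frames are lists of rows of integer pixel values; the class name FrameDifferencer and its push method are hypothetical and stand in for the hardware subtraction circuit and single frame memory.

```python
def difference_frame(current, previous):
    """Subtract the previous frame from the current frame, pixel by pixel."""
    return [
        [cur_px - prev_px for cur_px, prev_px in zip(cur_row, prev_row)]
        for cur_row, prev_row in zip(current, previous)
    ]


class FrameDifferencer:
    """Software stand-in (assumed) for the single frame memory and subtractor."""

    def __init__(self):
        self.previous = None  # frame memory: holds the previous digitized frame

    def push(self, frame):
        """Return the difference frame, or None when no previous frame exists yet."""
        diff = None
        if self.previous is not None:
            diff = difference_frame(frame, self.previous)
        # Store the new frame so it becomes the "previous" frame next time.
        self.previous = [row[:] for row in frame]
        return diff


differencer = FrameDifferencer()
first = differencer.push([[1, 2], [3, 4]])   # no previous frame yet
second = differencer.push([[1, 5], [3, 4]])  # only one pixel changed
print(first, second)
```

Unchanged pixels difference to zero, which is how static background drops out of the image.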
The N frame video memory electronics 60 receives either the differenced frames output
by the frame difference electronics 55 or the raw digitized frames from the camera
interface 50. The choice of frame source is made by the software resident on the digital
signal processor 65. This frame video memory is sized to hold greater than one full
frame of video up to a number of frames, N. The number of frames is to be driven by the
final hardware and software design.

The digital signal processor 65 implements the algorithm unique to this invention. This
algorithm determines the rate of head motion of the user in two dimensions. The digital
signal processor also detects the eye blink of the user in order to emulate the click and
double click action of a standard mouse button. In support of these functions, the digital
signal processor commands the N frame video memory 60 to supply either the
differenced frames or the raw digitized frames. The digital signal processor requires a
supporting program memory 70 made up of electrically reprogrammable memory
(EPROM) and data memory consisting of standard volatile random access memory
(RAM). The digital signal processor is also provided with an interface to the PCI bus
interface electronics 80 through which the cursor and button emulation is passed to the
user's main processor. The PCI interface also provides the means to pass raw digitized
video to the main processor as a user optional feature. This same interface provides the
means, via the digital signal processor interface, to reprogram the program memory 70,
allowing for software upgrades to provide additional features and performance.

The PCI interface electronics provide an industry standard bus interface supporting the
aforementioned communication path between the printed circuit card 20 and the user's
main processor 25.

The optional MPEG compression electronics 85 enable the printed circuit card and
camera to provide compressed video to the user's main processor. This compressed video
supports the use of the human computer interface electronics and camera in
teleconferencing applications. This dual use for both human computer interface and
teleconferencing is a unique combination offering the user an economical solution to two
distinct applications.

Figure 3 describes the head motion algorithm upon which this invention is based. Note
that not all of the functions shown in Figure 3 are implemented in software in the DSP.
As with other algorithms typical of the detection of motion, this algorithm relies on the
correlation of images from one frame to the next. The unique aspect of this invention lies
in the use of frame differenced images in the correlation process. The frame differencing
operation removes all parts of the camera images that are unchanged from the previous
frame. All room background behind the user is therefore removed from the image. This
greatly simplifies the detection of feature motion. The user's face image consists of
regions of uniform illumination such that even with the user's facial motion, these uniform
regions (e.g. cheeks, forehead, chin) will also be removed. Note the user's face also
consists of dynamic features such as nose, eyes, eyebrows and mouth which have enough
spatial detail that they will be evident in the differenced image. As the user moves his
face with respect to the room lighting, the shape and distribution of these features will
change, but the frame rate of the camera ensures that these features look similar from one
frame to the next. The correlation process therefore operates to determine how these
differenced features are moving from one frame to the next in order to determine user
head motion.

The algorithm receives video images of the user, imaged over time. Each image received
is provided to both a frame memory 100 and a differencer 105. Though the preferred
implementation is to buffer a single frame in this memory, the memory may consist of
many frames, buffered such that the first frame input is the first frame output (FIFO).
The delayed frame is read from the frame memory 100 and subtracted from the current
frame using the differencer 105. The resulting difference frame is provided to both a
correlation process 115 and a difference frame memory 110.

Like the frame memory, while the preferred implementation requires only a single
difference frame, the difference frame memory 110 can hold many difference frames in
sequence in a FIFO arrangement. The delayed difference frame is read from the
difference frame memory and provided to the correlation function 115. The correlation
process 115 determines the best combination of row and column shifts in order to
minimize the difference between the current difference frame and the delayed difference
frame. The number of rows and columns required to align these difference images
provides information on the user's motion. The best-fit function 120 determines the row
and column shift giving the optimum alignment. In the case of a classical correlation
process, the best-fit function 120 would consist of a peak detect.
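
The search for the best row and column shift can be illustrated with a brute-force sketch. It is not the disclosed implementation: it assumes small integer-valued difference frames, scores each candidate shift by a sum of absolute differences, and treats shifts as circular (wrap-around) for simplicity. The names shift_cost and best_fit_shift and the parameter max_shift are hypothetical.

```python
def shift_cost(current, delayed, d_row, d_col):
    """Sum of absolute differences with `delayed` circularly shifted by (d_row, d_col)."""
    rows, cols = len(current), len(current[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            # Pixel (r, c) of the current frame is compared with the delayed
            # pixel that would land there after the shift (wrap-around assumed).
            total += abs(current[r][c] - delayed[(r - d_row) % rows][(c - d_col) % cols])
    return total


def best_fit_shift(current, delayed, max_shift=2):
    """Return the (row, column) shift that minimizes the difference between frames."""
    best = None
    for d_row in range(-max_shift, max_shift + 1):
        for d_col in range(-max_shift, max_shift + 1):
            cost = shift_cost(current, delayed, d_row, d_col)
            if best is None or cost < best[0]:
                best = (cost, d_row, d_col)
    return best[1], best[2]


def blank(rows, cols):
    return [[0] * cols for _ in range(rows)]


delayed = blank(5, 5)
delayed[1][1], delayed[2][1] = 9, 4   # a small feature in the delayed difference frame
current = blank(5, 5)
current[2][3], current[3][3] = 9, 4   # same feature moved down 1 row, right 2 columns
print(best_fit_shift(current, delayed))  # prints (1, 2)
```

An exhaustive search like this costs one full image comparison per candidate shift, which is one reason a correlation computed with transforms may be preferred in practice.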

The best-fit function 120 provides the relative motion, in rows and columns, of the
observed user's features. The cursor update compute function 125 translates this
measured motion into the position change required of the cursor. This is likely to be a
non-linear process such that with greater head motion, the cursor moves a
non-proportionally greater distance. For example, a 1-pixel user motion will cause the cursor
to move one screen pixel while a 10-pixel user motion may cause a 100-pixel screen
cursor motion.
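
One mapping consistent with this example is a signed square law. The text does not specify the actual non-linear function, so the sketch below is an assumption chosen only because it reproduces the 1-to-1 and 10-to-100 figures; the function name cursor_delta and the exponent parameter are hypothetical.

```python
def cursor_delta(user_pixels, exponent=2.0):
    """Map measured user motion (image pixels) to cursor motion (screen pixels).

    A power law with exponent 2 is an assumed example, not the disclosed mapping.
    The sign of the motion is preserved so left/right and up/down still work.
    """
    magnitude = round(abs(user_pixels) ** exponent)
    return magnitude if user_pixels >= 0 else -magnitude


for motion in (1, 10, -3, 0):
    print(motion, "->", cursor_delta(motion))
```

Applied independently to the row and column components, small head motions give fine positioning while larger motions cross the screen quickly.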

The video cursor control 135 provides a user interface to enable and disable the operation
of the video human motion cursor control. This control is implemented either through a
combination of keystrokes on the user's keyboard, by sensing the eye-blink of the user or
through voice commands. The functionality of the video cursor control 135 is to provide
the user with the equivalent of a mouse pick-up, put-down action. As the user moves the
cursor from left to right across the screen, the user would de-activate the motion based
cursor control in order to allow the user to move his head back to the left. Once the user
has recentered his head, the user would once again activate the cursor control and
continue to move the cursor about the screen. The activation/deactivation of the mouse
input is represented by the switch 140, such that the open position of the switch disables
the human motion control of the cursor and supplies a zero change input to the
summation operation 130.

When the video cursor control 135 enables the human motion control of the cursor, the result
of the cursor update compute function 125 is added to the known current cursor position
by the summer 130. This summation has an x component and a y component. The result
of the summation 130 is used to update the cursor position on the user's screen via the
user's operating system. The current cursor position is provided by the computer
operating system. The cursor position may be controlled by both the user's visible motion
as well as the motion imparted by other input devices such as a standard computer mouse.
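
The switch and summer can be sketched as a single update step. This is an illustrative sketch only: the function name update_cursor, the screen size, and the clamping of the result to the screen bounds are assumptions not stated in the text.

```python
def update_cursor(position, delta, enabled, screen=(640, 480)):
    """Add the computed motion to the current cursor position when enabled.

    When the control is disabled (switch open), a zero change is summed, so the
    cursor stays where the operating system last reported it.
    Clamping to the screen bounds is an added assumption.
    """
    dx, dy = delta if enabled else (0, 0)
    x = min(max(position[0] + dx, 0), screen[0] - 1)
    y = min(max(position[1] + dy, 0), screen[1] - 1)
    return (x, y)


print(update_cursor((100, 100), (5, -3), enabled=True))   # prints (105, 97)
print(update_cursor((100, 100), (5, -3), enabled=False))  # prints (100, 100)
```

Disabling the control while recentering the head corresponds to calling the update with enabled set to False, which is the software analogue of lifting a mouse off the desk.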

Figure 4 provides a detailed description of the preferred implementation of the algorithm 
described in functions 100-120 in Figure 3. 

Video data is received by the processing electronics in both a single frame memory 200
and a differencer 205. The output of the frame memory 200 is also provided to the
differencer 205 such that the previous frame is subtracted from the current frame. This
differenced frame is then processed by a two dimensional FFT 210.

The complex result of the FFT 210 is provided to both a complex multiplier 225 and a 
complex memory 215. The complex memory is the size of the processed image, each 
location containing both a real and imaginary component of a complex number. With 
each new FFT operation 210, the previous FFT result, contained in the complex memory 
215, is provided to the conjugate operation 220. The complex conjugate of each element 
is computed and provided to the complex multiplier 225. In this manner, the FFT of the 
previous frame difference is conjugated and multiplied against the FFT of the current 
difference image. 

The two dimensional array of complex products output by the complex multiplier 225 is
provided to a two dimensional inverse FFT operation 230. This operation creates an
image of the correlation function between the latest pair of difference images. The
correlation image is processed by a peak detection function in order to determine the shift
required to align the two difference images. The x-y magnitude of this shift is
representative of the user's motion. This x-y magnitude is provided to the software that
will be used to update the cursor position as described in Figure 3.
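
The transform, conjugate, multiply, inverse transform and peak detect chain can be sketched as follows. This is a minimal illustration under stated assumptions: a plain discrete Fourier transform written with the standard library stands in for a real FFT (it gives the same result on small images, only more slowly), and the names dft_1d, dft_2d and correlation_shift are hypothetical.

```python
import cmath


def dft_1d(values, inverse=False):
    """Naive 1-D discrete Fourier transform (stand-in for an FFT)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out


def dft_2d(image, inverse=False):
    """2-D transform: transform every row, then every column."""
    rows = [dft_1d(row, inverse) for row in image]
    cols = [dft_1d(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def correlation_shift(current_diff, previous_diff):
    """Return the (row, column) shift at the peak of the circular correlation."""
    f_cur = dft_2d(current_diff)
    f_prev = dft_2d(previous_diff)
    # Multiply the current spectrum by the conjugate of the previous spectrum.
    product = [[a * b.conjugate() for a, b in zip(row_a, row_b)]
               for row_a, row_b in zip(f_cur, f_prev)]
    corr = dft_2d(product, inverse=True)
    n_rows, n_cols = len(corr), len(corr[0])
    # Peak detect on the real part of the correlation image.
    peak_r, peak_c = max(((r, c) for r in range(n_rows) for c in range(n_cols)),
                         key=lambda rc: corr[rc[0]][rc[1]].real)
    # Indices past the midpoint represent negative (wrapped) shifts.
    if peak_r > n_rows // 2:
        peak_r -= n_rows
    if peak_c > n_cols // 2:
        peak_c -= n_cols
    return peak_r, peak_c


previous_diff = [[0] * 8 for _ in range(8)]
previous_diff[2][2], previous_diff[3][2] = 9, 4
current_diff = [[0] * 8 for _ in range(8)]
current_diff[3][4], current_diff[4][4] = 9, 4   # feature moved 1 row down, 2 columns right
print(correlation_shift(current_diff, previous_diff))  # prints (1, 2)
```

The returned row and column shift plays the role of the x-y magnitude passed on to the cursor update step.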
NEW ABSTRACT



A human motion following controller (10) is provided by the invention to augment
motion of items (e.g., computer cursor or scene view) shown on a computer
display. The display (14) is coupled to the computer (20) which controls
positioning of the items through operating system controls. A camera (16) captures
frames of data corresponding to a first image of at least part of a user (e.g., eyes,
hands) at the computer display. Signal processing electronics (18) coupled to the
camera (a) detects differences between successive frames of data corresponding to
motion of the first image, and (b) communicates differences information to the
computer to reposition display of the items through the operating system controls.
The items are thus repositioned on the display by an amount corresponding to the
motion of first image.

Background



The primary human interfaces to today's computer are the keyboard, to enter 
textual information, and the mouse, to provide control over graphical information. 
These interfaces help users with word processing, presentation software, computer 
aided design packages, spreadsheet analyses, and other applications. These 
interfaces are also widely used for computer gaming entertainment; though they are 
often augmented or replaced by a joystick. 

In daily use of business software applications, control of cursor position on 
the screen requires that the user remove his/her hand from the keyboard in order to 
use the standard mechanical mouse. The use of the mouse introduces several issues. 
In a desk environment, the mouse requires maintenance of space on the desk area. 
The mouse cord must also remain free from obstruction to facilitate movement. 
Additionally, the use of the mouse is a major contributing factor of carpal-tunnel 
syndrome. It would be advantageous therefore to find an alternative to the 
mechanical mouse. 

In computer gaming, game complexity generally requires control of the (i)
mouse and keyboard, or (ii) joystick and keyboard. Further, gaming applications
usually require control in several axes of motion, including forward motion, reverse
motion, left turn, right turn, left strafe (slide), right strafe, upward motion and
downward motion. To further complicate game maneuvers and control, many
games permit viewing (within the game environment) in directions different from
that in which the vehicle (e.g., the car, or person, simulated within the game) is
moving, including up, down, left and right. These many complexities of motion in
fact increase or modify the complexity and enjoyment of the game.

Nevertheless, these complexities require that the user have utmost dexterity 
and control of his/her body. One object of the invention, therefore, is to offer 
alternative approaches to human-computer interfaces for those incapable of using 
standard devices (e.g., mouse, keyboard and joystick) such as due to disability. 

Another object of the invention is to provide an alternative input device for laptop 
computers. Laptop computers are often used in locations which do not allow the use of a 
mouse, such as in airplanes or during business meetings in which there is no room to 
operate the mouse. Through the use of either a clip-on camera or a camera built into the 
laptop display, the laptop user can control the mouse position or use the camera for 
teleconferencing while on the road. 

Other objects of the invention are to replace or augment existing human computer 
interfaces to facilitate enhanced gaming and/or control within game environments. 

In the prior art, certain systems exist which attempt to reduce the amount of 
physical interaction required with game controllers. However, such systems are 
prohibitively expensive to the general public as their costs are driven by techniques 
and algorithms which detect user head motion based upon a detectable target worn 
by the user. Other costly and cumbersome systems require the user to wear 
apparatus which emits or detects a signal. It is thus one other object of this invention 
to provide a system which detects user motion without the aid or augmentation of 
artificial devices placed on the user operator. 

Another object of the invention is to provide a means of human control of a 
graphical computer interface through the physical motion of the user in order to 
control the activity of a cursor in the manner usually accomplished with a computer 
mouse. 

A further object of the invention is to provide additional degrees of freedom 
in the human computer interface in support of computer games and entertainment 
software. 

Yet another object of the invention is to provide dual use of teleconferencing 
and video electronics with gaming and computer control systems. 

These and other objects will be apparent in the description which follows. 

Summary of Invention 

As used herein, "cursor" means a computer cursor associated with a 
computer screen. "Scene view" means the view presented on a computer display to 
a user. For example, one scene view corresponds to the scene presented to a user 
during a computer game at any given moment in time. The game might include 
displaying a scene whereby the user appears to be walking in a forest, and through 
trees. In another example, a cursor might also be visible in the scene view as a 
mechanism for the user to select certain events or items on the scene (e.g., to open a 
door in a game, or to open a folder to access computer files). 

As used herein, "camera" refers to a solid state instrument used in imaging. 
Typically, the camera also includes optical elements which refract light to form an 
image on the camera's detector elements (typically CCD or CMOS). For example, 
one camera of the invention derives from a video-conferencing camera used in 
conjunction with Internet communication. 

In one aspect, the invention provides systems and methods to control 
computer cursor position (or, for example, the scene view or game position as 
displayed on the computer display) by motion of the user at the computer. A 
camera rests on or near the computer, or is built into the computer, and connects 
therewith to collect "frames" of data corresponding to images of the user. These 
images provide information about user motion, over time. Software within the 
computer assesses these frames and algorithmically adjusts cursor motion (or scene 
view, or mouse button, or some other operation of the computer) based upon this 
motion. The motion may be imparted by up-down or left-right motion of the user's 
head, by the user's hands, or by other motions presented to the video camera (such 
as discussed herein). In one aspect, a close up view of the user's facial features is 
used to impart a translation in the cursor (or scene view) even though the features 
in fact rotate with the user's head. In yet another aspect, the rotation is used to 
generate a corresponding rotation in computer game scene imagery. 

In one aspect, the invention also provides a human factors approach to cursor 
movement in which the user's rate of motion determines the relative motion of the 
cursor (or scene view). By way of example, the faster the user's head travels over a 
set distance, the further the corresponding cursor movement over the same time 
period. 
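
By way of illustration only, the rate-dependent mapping described above can be 
sketched as follows. The sketch assumes a gain that grows linearly with head speed; 
the function name cursor_delta and the constants base_gain and accel are 
hypothetical and are not taken from this disclosure. 

```python
def cursor_delta(head_shift_px, frame_dt_s, base_gain=4.0, accel=0.05):
    """Map a head shift (camera pixels) seen over one frame interval to a
    cursor shift (screen pixels). Faster head motion receives a larger gain.

    base_gain and accel are illustrative assumptions, not disclosed values.
    """
    speed = abs(head_shift_px) / frame_dt_s   # camera pixels per second
    gain = base_gain + accel * speed          # assumed linear growth with speed
    return head_shift_px * gain


# The same 2-pixel head shift moves the cursor further when it happens faster.
slow = cursor_delta(2, 1 / 15)   # shift observed over a 1/15 second interval
fast = cursor_delta(2, 1 / 30)   # same shift observed over a 1/30 second interval
```

Other gain curves (for example, a stepped or saturating curve) would serve 
equally well; the linear form is chosen here only for brevity. 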

In other aspects of the invention, the camera is either (a) a visible light camera 
utilizing ambient lighting conditions or (b) a camera sensitive in another band such 
as the near infrared ("IR"), the IR, or the ultraviolet ("UV") spectrum. In the latter 
case (b), the illumination preferably emanates from a source such as an IR lamp 
which is beyond human sensory perception. The sensor is typically mounted facing 
the user so as to capture a picture of the user's face in the associated electromagnetic 
spectrum. The lamp is typically integrated with the camera housing so as to facilitate 
production and ease of consumer set-up. 

In one aspect, a system of the invention provides an IR camera (i.e., a camera 
which images infrared radiation) to image the user's face and to gauge the user's 
stress level associated with a game on the computer. As the user's intensity increases 
(such as in a fast moving computer game using a joystick or the methods discussed 
herein), the system detects increased heat intensity on the user's face, forehead or 
other body part by the imagery of the IR camera. This information is fed back into 
the game processor to provide further enhancement to the game. In this manner, the 
system gauges the user's reaction to the game and modifies game speed or operation 
in a meaningful way. For example, suppose such a system determined that a 
particular user was bored with the present game speed (a determination of boredom 
can be made by assessing low IR output over large portions of the user's face). The 
computer processor and game software can then cooperate to increase the gaming 
speed and thereby increase this particular user's stress. Games of the invention are 
thus made and sold to users with varying intelligence, age and/or computer 
familiarity; and yet the system always "pushes the envelope" for any given user so 
as to make the game as interesting as possible, automatically. 
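
A minimal sketch of such a feedback loop follows, assuming the IR image has 
already been reduced to normalized intensity samples (0 to 1) over the user's face. 
The function name adjust_game_speed and the thresholds low, high and step are 
hypothetical values chosen for illustration. 

```python
def adjust_game_speed(speed, face_ir_levels, low=0.3, high=0.7, step=0.1):
    """Return a new game speed from normalized IR samples of the user's face.

    low, high and step are illustrative assumptions, not disclosed values.
    """
    mean_ir = sum(face_ir_levels) / len(face_ir_levels)
    if mean_ir < low:
        # Low thermal output over the face: user assumed bored, so speed up.
        return speed + step
    if mean_ir > high:
        # High thermal output: user assumed over-stressed, so ease off,
        # but never drop below one step.
        return max(step, speed - step)
    return speed   # within the comfortable band: leave the speed unchanged
```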

In accord with one aspect of the invention, images captured by the sensor are 
processed by a digital signal processor ("DSP") located either (a) in a PC card within 
the host computer or (b) in a housing integrated with the sensor. In case (a), sensor 
frames are sent to the PC card; and detected user motion (sometimes denoted herein 
as "difference information") is communicated to the user's operating system via a 
PCI (or USB or later standard) bus interface. These difference information commands 
are interpreted by a low overhead program resident at the user's main processor, 
which either updates the cursor position on the screen or provides motion 
information to the user's computer game (e.g., so as to change the scene view). In 
case (b), the DSP is contained within the camera housing; and frames are processed 
local to the camera to determine difference information. This information is then 
transmitted to the computer by a cable that connects to a bus port of the computer so 
that the host processor can make appropriate movements of the cursor or scene 
view. In another aspect, the DSP is mounted in the camera housing such that the 
camera/signal processing subsystem produces signals which emulate the mouse via 
the mouse input connector. 



In an alternative configuration, frames of image data are sent directly to the 
host computer through the computer bus; and that image data is manipulated by the 
computer processor directly. With increasing computer processing speed, it is 
expected that sensor data frames can be sent directly to the host processor for all 
processing needs, in which case the PC card and/or separate DSP are not required. 
Although this is possible today, the update rates are likely too slow for practicality. 
Once GHz processors are on the market, a separate DSP may no longer be needed. 

In one aspect of the invention, pixel format or pixel density of the camera 
drives the accuracy of the system. Higher pixel density in the image of the user's 
face, for example, increases the attainable resolution and cursor control (or the 
attainable control of scene view motion). Camera formats of 240 vertical by 320 
horizontal generally provide satisfactory performance. The number of pixels that 
may be utilized is determined by system cost factors. Greater numbers of pixels 
require more powerful DSPs (and thus more costly DSPs) in order to process the 
image sequences in real time. Current technology limits the processing density to a 
64x64 window for consumer electronics. As prices are reduced, and power 
increases, the densities can increase to 128x128, 256x256 and so on. While 64x64 
density is satisfactory for general household users, a higher fidelity system using a 
greater number of pixels is possible, in accord with the invention, for higher end 
applications at a proportionally higher cost. 

In one aspect, the data transfer rate from the camera is 30 frames/second at 
240x320 pixels per frame. Assuming eight bits per pixel, the digital data transfer rate 
is therefore 18.432 megabits/second. This is a fairly high transfer rate for consumer 
products using current technology. While the data transfer can be either analog or 
digital, the preferred method of image data transfer for this aspect is via a standard 
RS170 analog video interface. 
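
The transfer rate stated above follows directly from the frame format. The short 
calculation below reproduces it; the helper name video_bit_rate is introduced here 
only for illustration. 

```python
def video_bit_rate(rows, cols, bits_per_pixel, frames_per_second):
    """Raw (uncompressed) video data rate in bits per second."""
    return rows * cols * bits_per_pixel * frames_per_second


# 240 x 320 pixels, eight bits per pixel, 30 frames per second.
rate = video_bit_rate(240, 320, 8, 30)
print(rate / 1e6, "megabits/second")   # 18.432 megabits/second
```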

In accord with one aspect, a system of the invention defines two imaging 
zones (either within a single camera CCD or within multiple CCD cameras housed 
within a single housing). One imaging zone covers the user's head; and the other 
covers the user's eyes. This aspect includes processing means to process both zones 
whereby movement of the user's head provides one mechanism to control cursor 
movement (or scene view motion), and whereby the user's eyes provide another 
mechanism to control the movement. In essence, this aspect increases the degrees of 
freedom in the control decision making of the system. By way of example, a user 
might look left or right within a game without moving his head; but by assessing 
movement of the user's eyes (or the pupils of those eyes), the scene view can be 
made to rotate or translate in the manner desired by the user. Further, a user might 
move his head for other reasons, and yet not move his eyes from a generally 
forward looking position; and this aspect can assess both movements (head and 
eyes) to select the most appropriate movement of the cursor or scene view, if any. 

In another aspect, a system of the invention utilizes a camera with zoom 
optics to define the user's pupil and to make cursor or scene views move according 
to the pupil. In another aspect, the system incorporates a neural net to "learn" about 
a user's eye movements so that more accurate movements are made, over time, in 
response to the user's eye movement. 

In still another aspect, a neural net is used to learn about other movements of 
the user to better specify cursor or scene view movement over time. 

In yet another aspect of the invention, a system is provided with two CCD 
arrays (either within a single camera body or within two cameras). The arrays 
connect with the user's computer by the techniques discussed herein. One CCD 
array is used to image the user's head; and the other is used to image the user's 
body. Motion of the user is then evaluated for both head and body movement; and 
cursor or scene view movement is adjusted based upon both inputs. 

In another aspect of the invention, a single CCD is used to image the user. 
However, alternate frames are zoomed, electronically, so that one frame views the 
user's head, and the next frame views the user's eyes. With the algorithm discussed 
herein, these separate frame sequences (one for the eyes, one for the head) are 
processed separately and evaluated, together, to make the most appropriate cursor 
or scene view movement. If, for example, the system clocks at 30Hz, then one set of 
frame sequences operates at 15Hz, and the other at 15Hz. However, the advantage is 
that two movement information sets can be evaluated to invoke an appropriate 
movement in the cursor or scene view. 
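
The interleaving of the two sequences can be sketched as follows. The sketch 
assumes that even-numbered frames view the head and odd-numbered frames view the 
eyes; that ordering, and the function names, are assumptions made for illustration. 

```python
def split_alternating(frames):
    """Split an interleaved frame stream into (head_frames, eye_frames).

    Assumes frame 0, 2, 4, ... view the head and frame 1, 3, 5, ... the eyes.
    """
    return frames[0::2], frames[1::2]


def per_sequence_rate(clock_hz, sequences=2):
    """Update rate of each sequence when frames are shared equally."""
    return clock_hz / sequences


head, eyes = split_alternating(["h0", "e0", "h1", "e1", "h2", "e2"])
rate = per_sequence_rate(30)   # each sequence updates at half the camera clock
```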

Those skilled in the art should appreciate that different frame rates can be 
used; and frame rates for either sequence (head or eyes) can occur at different rates 
too. Further, the separate frame sequences can utilize other body parts, e.g., the head 
and the hand, to have two movement evaluations. Alternatively, a separate camera 
(or CCD array) can be used to image other body parts, for example one camera for 
the head and one for the hand. 

The invention also provides methods for shifting cursor or scene views in 
response to user movement. In one aspect, the scene view shifts left or right when 
the user shifts left and right. In another aspect, the scene view rotates when the 
user's head rotates. This last aspect can be modified so that such rotation occurs so 
long as the eyes do not also rotate (in this situation, the user's head rotates, 
indicating that she wishes the scene view to rotate; but the eyes do not, indicating 
that she still watches the game in play). In another aspect, the scene view rotates in 
response to the user's hand rotation (i.e., a camera or at least a CCD array of the 
system is arranged to view the player's hand). 

In another aspect, the invention provides a multi-zone player gaming system 
whereby the user of a particular computer game can select which zone operates to 
move the cursor or the scene view. By way of example, the system can include one 
zone corresponding to a view of the user's head, where frames of data are captured 
by the system by a camera. Another zone optionally corresponds to the user's hand. 
Another zone optionally corresponds to the user's eyes. Each zone is covered by a 
camera, or by a CCD array coupled within the same housing, or by optical zoom 
zones within a single CCD, or by separate optical elements that image different 
portions of the CCD array. By way of example, two zones can be covered with a 
single CCD array (i.e., a camera) when the zones are the user's head and eyes. The 
camera images the head, for one zone, and images the eyes in another zone, since the 
zones are optically aligned (or nearly so). However, two cameras (or optionally two 
CCD arrays with separate optics) can view two zones such as the user's head and the 
user's hand. Combinations of zones are also possible and envisioned in accord with 
the invention. 

Zones in a single camera can also be identified by the computer by 
prompting the user for motion from corresponding body parts. For instance, the 
computer identifies the head zone by prompting the user to move his head. Then 
the computer identifies the foot zone by having the user move his foot. Once the 
zones are identified, the motion of each of these individual zones is tracked by the 
computer, and the regions of interest in the camera image related to the zones are 
moved as the targets in the zones move with respect to the camera. 
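
One way to sketch the prompted identification of a zone is to take the bounding box 
of the pixels that change while the user moves the requested body part. The 
function name motion_zone and the threshold value are hypothetical; frames are 
assumed to be lists of rows of gray levels. 

```python
def motion_zone(before, after, threshold=10):
    """Bounding box (top, left, bottom, right) of the pixels whose gray level
    changed by more than `threshold` between two frames, or None if no pixel
    changed that much."""
    rows = [r for r, (a, b) in enumerate(zip(before, after))
            if any(abs(x - y) > threshold for x, y in zip(a, b))]
    cols = [c for c in range(len(before[0]))
            if any(abs(before[r][c] - after[r][c]) > threshold
                   for r in range(len(before)))]
    if not rows:
        return None
    return rows[0], cols[0], rows[-1], cols[-1]


# Frame taken before the prompt, and frame taken while the user moves.
still = [[0] * 4 for _ in range(4)]
moved = [row[:] for row in still]
moved[1][2] = 100
moved[2][2] = 100
zone = motion_zone(still, moved)   # region of interest for the prompted part
```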

In one aspect, the invention provides a system, including a camera and edge 
detection processing subsystem, which isolates edges of the user's body, for 
example, the side of the head. These edges are used to move the cursor or scene 
view. For example, if the left edge of the head is imaged onto column X of one frame 
of the CCD within the camera, and yet the edge falls in column Y in the next frame, 
then a corresponding movement of the cursor or scene view is commanded by the 
system. For example, movement of the edge from one column to the next might 
correspond to ten screen pixels, or other magnification. In one aspect, this 
magnification is selected by the user. Up and down motion can also be detected by 
similar edge detection. For example, by imaging the user's chin, an edge movement 
in the up or down dimension is formed (e.g., if the bottom edge of the chin moves 
from one row to the next, in adjacent frames, then a corresponding movement of the 
cursor or scene view is made - magnification again preferably set manually with a 
default starting magnification). Other images can also serve to define edges. For 
example, in one aspect, a user's eyelash can be used to move the cursor (or scene 
view) up or downwards; though typically the eye blink is used to reset the cursor 
command cycle. 
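
The column-to-cursor mapping described above can be sketched as follows, 
assuming a simple brightness-jump test to locate the edge within one image row. The 
function names, the threshold, and the edge test itself are assumptions made for 
illustration; the magnification of ten screen pixels per column is the example 
given above. 

```python
def left_edge_column(row, threshold=50):
    """Index of the first column where brightness jumps by more than
    `threshold` relative to its left neighbor, or None if no edge is found."""
    for c in range(1, len(row)):
        if abs(row[c] - row[c - 1]) > threshold:
            return c
    return None


def cursor_shift(prev_row, curr_row, magnification=10):
    """Screen-pixel shift for an edge that moved from column X to column Y."""
    x = left_edge_column(prev_row)
    y = left_edge_column(curr_row)
    if x is None or y is None:
        return 0                      # no edge in one frame: command no motion
    return (y - x) * magnification


# The edge of the head moves one column to the right between frames.
shift = cursor_shift([0, 0, 200, 200, 200], [0, 0, 0, 200, 200])
```

The same functions applied to an image column, rather than a row, would give the 
up and down case. 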

In one aspect, an optical matched filter is used to center image zones onto the 
appropriate images. For example, as discussed above, one aspect preferably utilizes 
64x64 pixels as the image frame from which cursor motion is determined. Many 
cameras have, however, many more pixels. These 64x64 arrays are therefore 
preferably established through matched filtering. By way of example, an image of a 
standard pair of user's eyes is stored within memory (according to one aspect of the 
invention). This image field is cross-correlated with frames of data from the actual 
image from the camera to "center" the image at the desired point. With eyes, 
specifically, ideally the 64x64 sample array is centered so as to view both eyes within 
the 64x64 array. Similarly, to process sequences of head data, a standard head image 
is stored within memory, according to one aspect, and correlated with the actual 
image to center the head view. 
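
A minimal software sketch of this centering step follows. It uses a brute-force, 
unnormalized, zero-mean cross-correlation, which is an assumption made for brevity 
and is not necessarily the filter of the invention; a practical system might prefer 
a normalized correlation. The function names are hypothetical, and clamping of the 
window at the image border is omitted. 

```python
def best_match(image, template):
    """Return (row, col) of the top-left corner where the zero-mean template
    correlates most strongly with the image (exhaustive search)."""
    th, tw = len(template), len(template[0])
    mean_t = sum(map(sum, template)) / (th * tw)
    t0 = [[v - mean_t for v in row] for row in template]
    best_score, best_pos = None, None
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            score = sum(t0[i][j] * image[r + i][c + j]
                        for i in range(th) for j in range(tw))
            if best_score is None or score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos


def centered_window_origin(match_pos, template_shape, window=64):
    """Top-left corner of a window x window sample array centered on the
    middle of the matched template (no border clamping)."""
    r, c = match_pos
    th, tw = template_shape
    return r + th // 2 - window // 2, c + tw // 2 - window // 2
```

In use, the stored standard image (e.g., of a pair of eyes) is passed as the 
template, and the returned origin selects the 64x64 sample array from the full 
camera frame. 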

Those skilled in the art should appreciate that an appropriate frame size can 
be established from an image having more or fewer pixels, by redundantly allocating 
data into adjacent pixels or by eliminating intermediate pixels, or similar technique. 

In another aspect, a camera is provided which optically "zooms" to provide 
optimal imaging for a desired image zone. By way of example, the invention of one 
aspect takes an image of the user's head, determines the location of the user's eyes 
(such as by matched filtering), and optically zooms the image through movement of 
optics to provide an image of the eyes in the desired processing size format. 

Many aspects of the invention are preferably enhanced by autofocus. 
Specifically, it is often desirable to have a crisp image of the user (or a part of the 
user, e.g., the user's eyes) in order to accurately process desired cursor or scene view 
movement. Thus, autofocus capability preferably operates in most of the aspects of 
the invention where imaging is a feature of the processing. 

In one aspect, the camera utilizes a very small aperture which results in a very 
large depth of field. In such a situation, autofocus is not required or desired. The 
optical requirements for the lenses are also reduced. 

The invention thus provides several advantages over the art. For example, 
game controllers can now include feedback corresponding to the user's actual 
movement. By way of another example, if the user moves left or right (or head or 
hand or eyes move left or right, depending on the image zone), then the cursor (or 
scene view) can also be set to move left or right. When the user twists her head, for 
example, the scene view can also be made to rotate, reflecting that movement. 

Those skilled in the art should appreciate that the direction in which the scene 
moves, left or right, is a matter of design choice. That is, certain games might find it 
desirable to move the opposite direction from what the user moves, to add certain 
challenges to the game. Further, in other aspects, this direction can change during 
the game to further complicate game control. 

In accord with one aspect of the invention, a processing subsystem (connected 
with the camera) is used to make cursor movement (or scene view movement) 
correspond to the user's motion. This processing subsystem of another aspect further 
detects when the user twists his head, to add an additional dimension to the 
movement. 

In one aspect, a system of the invention includes an IR detector which is used 
to determine when a person sweats or heats up (by imaging, for example, part of the 
user's head onto the IR detector); and then the system adjusts game speed in a way 
corresponding to this movement. Alternatively, a heartbeat sensor is tied to the 
person to sense increased excitement during a game and the system speeds or slows 
the game in a similar manner. Note that a heartbeat sensor can be constructed, in one 
aspect of the invention, by thermal imaging of the user's face, detecting blood flow 
oscillations indicative of heartbeat. In other aspects, the heartbeat sensor is 
physically tied to the user, such as within the computer mouse or joystick. 

In one aspect, a computer of the invention adapts to user control as selected 
by a particular user. For example, in the case of a handicapped person, a particular 
user might select certain hand-movements, e.g., a single finger up, to move the 
cursor up; and another finger down to move the cursor left. An infinite combination 
of controls can be established; this is one advantage of the invention in that 
users with many different disabilities can program cursor or scene view movement. 
In one aspect, a neural network is used to assist the processing system in establishing 
proper cursor movement. In another aspect, the computer for example learns to 
print something by movement of the user's finger (or other body part). 

In one aspect, tipping of the user's head (or other body part, or object) is used 
to provide another degree of freedom in moving the cursor or adjusting the scene 
view. By way of example, a tilt of the head, as imaged by the camera, can be set to 
command a rotation of the scene view. 

In still another aspect, a camera of the invention uses autozoom to move in 
and out of a given scene view. By way of example, the camera is first focussed on the 
user's face in one frame; but in a subsequent frame the camera must focus closer 
to compensate for the fact that the user moved closer to the camera (typically, the 
camera is on the monitor, so this also means that the user moved closer to the scene 
view). This autozoom is used, in one aspect, to make the scene view appear as if the 
user is "creeping" into the scene. By moving the scene in and out, the user will 
perceive that he is moving in or out of the scene view. 

In another aspect, a camera images an object held by the user. Preferably, the 
object has a well-defined shape. The system images the object and determines 
difference information corresponding to movement of the object. By way of example, 
rotating the object upside down results in difference information that is upside 
down; and then the scene view inverts by operation of the system. In another 
example, twisting of the object rotates the scene view left or right, or rotates the 
scene in the direction of the twisting. 



In another aspect, two cameras image the user: one camera pointed at the 
front of the user's face or hand and the other pointed down at the top of the user's 
head or hand. The front facing camera is used to detect rotational and linear 
translation in up-down and left-right directions. The top viewing camera determines 
front-back and left-right translation. The front-back translation observed by the top 
camera is used to control forward and back motion in the user's 3-D view. The top 
sensed left-right translation controls the user's left-right slide or strafe. The top 
sensed left-right motion is removed from the front view left-right translation, with 
the remaining front view measure representative of left-right twist. All of the front 
view up-down translation can be interpreted as up-down twist. 
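
The combination of the two camera measurements can be sketched as follows. The 
sketch assumes that both cameras report translation in the same units and scale, 
which in practice would require calibration; the function name decompose_motion 
and the dictionary keys are hypothetical. 

```python
def decompose_motion(front_lr, front_ud, top_lr, top_fb):
    """Combine front-camera and top-camera translations into game controls.

    Assumes all four inputs share the same units and scale (an assumption).
    """
    return {
        "forward_back": top_fb,          # top camera front-back translation
        "strafe": top_lr,                # top camera left-right translation
        "twist_lr": front_lr - top_lr,   # front left-right with strafe removed
        "twist_ud": front_ud,            # all front up-down taken as twist
    }


# The user slides 3 units right while also turning the head to the right.
controls = decompose_motion(front_lr=5, front_ud=2, top_lr=3, top_fb=1)
```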

Brief Description of the Drawings 

Figure 1 illustrates one human computer interface system constructed 
according to the invention; 



Figure 1A shows an exemplary computer display illustrating cursor 
movement made through the system of Figure 1; 

Figure 1B illustrates overlaid scene views, displayed in two moments of 
time on the display in Figure 1, of a shifting scene made in response to user 
movement captured by the camera of Figure 1; 

Figure 1C shows an illustrative frame of data taken by the system of Figure 1; 

Figure 2 illustrates selected functions for a printed circuit card used in the 
system of Figure 1; 

Figure 3 illustrates an algorithm block diagram that preferably operates with 
the system of Figure 1; 

Figure 4 illustrates one preferred algorithm process used in accord with the 
invention to determine and quantify body motion; 

Figure 5 shows one process of the invention for communicating body motion 
data to a host processor, in accord with the invention, for augmented control of 
cursor position or scene view; 

Figure 5A shows a representative frame of data of a user taken by a camera of 
the invention, and further illustrates adding symbols to key body parts to facilitate 
processing; 

Figure 6 illustrates a two camera imaging system for implementing the 
teachings of the invention; 

Figure 7 illustrates two positions of a user as captured by a camera of the 
invention; and Figure 7A illustrates two positions of a scene view on a display as 
repositioned in response to movement of the user illustrated in Figure 7; 

Figure 8 illustrates motion of a user - and specifically twisting of the user's 
head - as captured by a system of the invention; Figure 8A illustrates a first scene 
view corresponding to a representative computer display before the twisting; Figure 
8B illustrates a second scene view corresponding to a rotation of the first scene view 
in response to the twisting by the user; Figure 8C shows processing features of the 
processing section of Figure 8; and Figure 8D illustrates multiple image frames 
stored in memory for matched filtering with raw images acquired by the system of 
Figure 8; 

Figure 9 illustrates a two camera system of the invention for collecting N 
zones of user movement and for repositioning the cursor or scene view as a function 
of the N movements; Figure 9A illustrates a representative thermal image captured 
by the system of Figure 9; and Figure 9B illustrates process methodology for 
processing thermal images as a real time input to game processing speed, in accord 
with the invention; 

Figure 10 illustrates another two camera system of the invention for targeting 
multiple image movement zones on a user, and further illustrating optional DSP 
processing at the camera section; 

Figure 11 illustrates framing multiple movement zones with a single imaging 
array, in accord with the invention; 

Figure 12 illustrates framing a user's eyes in accord with the invention; and 
Figure 12A shows a representative image frame of a user's eyes; 

Figure 13 illustrates one system of the invention, including zoom, neural nets, 
and autofocus to facilitate image capture; 

Figures 14, 14A and 14B illustrate autofocus motion control in accord with 
the invention; 

Figure 15 illustrates one other motion detect system algorithm utilizing edge 
detection, in accord with the invention; 

Figure 16 illustrates one other motion detect system algorithm utilizing 
well-characterized object manipulations, in accord with the invention; 



Figure 17 illustrates one other motion detect system algorithm utilizing varied 
body motions, in accord with the invention; 

Figure 18 illustrates a two camera system of the invention with a camera 
observing the user's face while the other observes the top of the user's head; and 

Figure 19 shows a blink detect system of the invention. 

Detailed Description of the Drawings 

Figure 1 illustrates, in a top view, certain major components of a human 
computer interface system 10 of the invention. A user 12 of the system 10 sits facing 
a computer monitor 14 with display 14a. A camera 16 is mounted on the computer 
monitor 14 facing the user 12. In the illustrated embodiment, the camera 16 is 
mounted in such a way that the user's face 12a is imaged within the camera's field of 
view 16a. However, as discussed herein, the camera 16 can alternatively image other 
locations, such as the user's hand, eyes, or other objects; so imaging of the user's 
face, in Figure 1, must be considered illustrative, rather than limiting. Further, the 
camera location can also reside at places other than on top of the monitor 14. 

With further regard to Figure 1, the camera 16 interfaces with a printed circuit 
card 18 mounted within the user's computer chassis 20 (which connects with the 
monitor 14 by common cabling 20a). The camera 16 interfaces to the printed circuit 
card 18 via a camera interface cable 22. The circuit card 18 also has processing 
section 18a, such as a digital signal processing ("DSP") chip and software, to process 
images from the camera 16. 

In operation, the camera 16 and card 18 capture frames of image data 
corresponding to user movement 25. The processing section 18a algorithmically 
processes the image data to quantify that movement 25; and then communicates this 
information to the host processor 30 within the computer 20. The host processor 30 
then commands movement of the computer cursor in a corresponding movement 
25a, Figure 1A (Figure 1A illustrates a representative front view of the display 14a, 
and also illustrates movement 25a of the cursor 26 moving within the display 14a in 
response to user movement 25). 

Figure 1B illustrates an alternative (or supplemental) process whereby the 
scene view shifts in response to user movement 25. Specifically, Figure 1B illustrates 
a first scene view 35a, which generally corresponds to a forest prior to the user's 
movement 25; and an overlaid scene view 35b (shown in dotted line, for purposes 
of illustration) that is shifted by an amount 37 in response to the user's movement 25. 
The shift 37 in the scene view 35 is accomplished by combined operation and 
processing of the processing section 18a and host CPU 30. 

Figure 1C shows a representative frame 41 of data 43 as taken by the camera 
16. As illustrated, data 43 represents the user's face 12a taken at a given moment of 
time. Subsequent frames (not shown) are used to determine user motion 25 relative 
to the frame 41, as discussed herein. The frame 41 is made up of a plurality of 
pixel data 45, as known in the art. 

Figure 2 illustrates certain functions processed within the printed circuit 
board 18 of Figure 1. A camera interface circuit 50 receives video data from the 
camera 16 through interface cable 22. This data can be in RS170 format. Circuit 50 
decodes the analog video data to determine video timing signals embedded in the 
analog data. These timing signals are used for control of the analog-to-digital (A/D) 
converter included in circuit 50 that converts analog pixel data into digital images. 
In the preferred embodiment, the analog data is digitized into 6 bits, though a 
greater number of bits may be acceptable and/or required for features as discussed 
herein. 

The frame difference electronics 52 receives digital data from the camera 
interface circuit 50. The frame difference electronics 52 include a multiple frame 
memory, a subtraction circuit and a state machine controller/memory addresser to 
control data flow. The frame memory holds previously digitized frame data. As 
each digitized pixel is received by the frame difference electronics 52, the 
corresponding pixel from a previous frame is read from the frame memory and 
subtracted from the current frame. The preferred implementation uses the frame 
just previous to the current frame, though an older frame which resides in the frame 
memory may be used. The resulting difference is output to an N-frame video 
memory 54. The new frame pixel data is then stored into the frame memory of the 
frame difference electronics 52. 
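
By way of illustration only, the per-pixel differencing described above can be sketched in software. The class name `FrameDifferencer`, the list-of-rows frame layout and the sample pixel values are assumptions made for this sketch; the disclosure describes a hardware subtraction circuit and frame memory, not this code.

```python
class FrameDifferencer:
    """Holds the previous frame and subtracts it, pixel by pixel, from each new frame."""

    def __init__(self):
        self.frame_memory = None  # previously digitized frame (list of rows)

    def push(self, frame):
        if self.frame_memory is None:
            # The first frame has no predecessor, so report "no change".
            diff = [[0] * len(row) for row in frame]
        else:
            diff = [
                [cur - prev for cur, prev in zip(cur_row, prev_row)]
                for cur_row, prev_row in zip(frame, self.frame_memory)
            ]
        # Store the new frame so that it becomes the "previous" frame next time.
        self.frame_memory = [row[:] for row in frame]
        return diff


differencer = FrameDifferencer()
background = [[10, 10, 10], [10, 10, 10]]   # static scene
moved = [[10, 40, 10], [10, 10, 10]]        # one pixel changed
differencer.push(background)
difference = differencer.push(moved)        # only the changed pixel is nonzero
```

Unchanged pixels cancel to zero, which is why static background drops out of the differenced frame.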

The N frame video memory electronics 54 either receives differenced frames 
output by the frame difference electronics 52 (discussed above) or raw digitized 
frames from the camera interface 50. The choice of where the data derives from is 
made by software resident on the DSP 56. The frame video memory 54 is sized to 
hold more than one full frame of video and up to N number of frames. The number 
of frames N is to be driven by hardware and software design. 

In the preferred embodiment, the DSP 56 implements an algorithm discussed 
below. This algorithm determines the rate of head motion of the user in two 
dimensions. The digital signal processor 56 also detects the eye blink of the user in 
order to emulate the click and double click action of a standard mouse button. In 
support of these functions, the DSP 56 commands the N frame video memory 54 to 
supply either the differenced frames or the raw digitized frames. The digital signal 
processor thus preferably utilizes a supporting program memory 58 made up of 
electrically reprogrammable memory (EPROM) and data memory 59 including 
standard volatile random access memory (RAM). The DSP 56 also interfaces to the 
PCI bus interface electronics 60 through which cursor and button emulation is 
passed to the user's main processor (e.g., the CPU 30, Figure 1). The PCI interface 60 
also passes raw digitized video to the main processor as an optional feature. 
Interface 60 also permits reprogramming of program memory 58, to allow for future 
software upgrades permitting additional features and performance. 

The PCI interface electronics 60 thus provides an industry standard bus 
interface supporting the aforementioned communication path between the printed 
circuit card 18 and the user's main processor 30. 

With optional MPEG compression electronics 62, the printed circuit card 18 
and camera 16 can provide compressed video to the user's main processor 30. This 
compressed video supports using the system 10 in teleconferencing applications, 
providing dual use as a human computer interface system 10 and/or as a 
teleconferencing system, an economical solution to two distinct applications. 

Figure 3 describes one preferred head motion block diagram algorithm 70 
used in accord with the invention. Not all of the functions shown in Figure 3 are 
implemented in software in the DSP 56. Rather, this algorithm relies on the 
correlation of images from one frame to the next, and particularly relies on the use of 
frame differenced images in the correlation process. The frame differencing 
operation removes parts of the camera images that are unchanged from the previous 
frame. For example, room background (such as object 13, Figure 1) behind the user 
12 is removed from the image. This greatly simplifies detection of feature motion. 
Even the image of the user's face consists of regions of uniform illumination 
such that, even with the user's facial motion, these uniform regions (i.e., cheeks, 
forehead, chin) may also be removed. The user's face 12a also consists of typically 
dynamic features such as the nose, eyes, eyebrows and mouth, each of which 
typically has enough spatial detail that will be evident in the differenced image. As 
the user moves his face with respect to room lighting, the shape and distribution of 
these features will change; but the frame rate of the camera 16 ensures that these 
features look similar from one frame to the next. The correlation process therefore 
operates to determine how these differenced features are moving from one frame to 
the next in order to determine user head motion 25. 

The algorithm of block diagram 70, Figure 3, receives video images 72 of the 
user as imaged by camera 16 over time. Each received image is passed to both a 
frame memory 74 and a differencer 76. Though the preferred embodiment is to 
buffer a single frame in memory 74, the memory 74 may optionally store many 
frames, buffered such that the first frame input is the first frame output (FIFO). The 
delayed frame is read from the frame memory 74 and subtracted from the current 
frame using the differencer 76. Frame output from the differencer 76 is provided to 
both a correlation process 78 and a difference frame memory 80. 

Like the frame memory 74, the preferred embodiment of frame memory 80 
utilizes a single difference frame; however the difference frame memory 80 can hold 
many difference frames in sequence in a FIFO arrangement. The delayed difference 
frame is read from the difference frame memory and provided to the correlation 
function 78. The correlation process 78 determines the best combination of row and 
column shifts in order to minimize the difference between the current difference 
frame and the delayed difference frame. The number of rows and columns required 
to align these difference images provides information as to the user's motion. 

The best-fit function algorithm 82 determines the row and column shift to 
provide optimal alignment. In the case of a classical correlation process, the best-fit 
function can consist of a peak detect algorithm. This algorithm may either be 
implemented in hardware or in software. 

The best-fit function algorithm determines relative motion in rows and 
columns of the observed user's features. The cursor update compute function 
algorithm 84 translates this measured motion into the position change required of 
the cursor (e.g., the cursor 26, Figure 1A). Typically, this is a non-linear process in 
which, with greater head motion, the cursor moves a disproportionately greater distance. 
For example a 1-pixel user motion can cause the cursor to move one screen pixel 
while a 10-pixel user motion may cause a 100-pixel screen cursor motion. However, 
these magnifications can be adjusted for desired result. This algorithm may either be 
implemented in hardware, such as through an ASIC or FPGA, or in software. 
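
One curve that reproduces the example figures given above (1 pixel of user motion to 1 screen pixel, 10 pixels to 100 screen pixels) is a sign-preserving square law. This is only an assumed mapping chosen to match those two points; the function name `cursor_delta` and the `gain` parameter are hypothetical, and the disclosure does not specify the curve.

```python
def cursor_delta(motion_pixels, gain=1.0):
    """Map measured user motion (camera pixels) to cursor motion (screen pixels)."""
    # Square law with the sign preserved: small motions give fine control,
    # large motions cover the screen quickly. `gain` is the adjustable magnification.
    return int(round(gain * motion_pixels * abs(motion_pixels)))


small = cursor_delta(1)     # fine positioning
large = cursor_delta(10)    # coarse, fast travel
back = cursor_delta(-10)    # direction is preserved
```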

Video cursor control 86 provides a user interface to enable and disable the 
operation of cursor control described above. This control is implemented, for 
example, through a combination of keystrokes on the user's keyboard (for example 
as connected to the host computer 20, Figure 1). Alternatively, cursor control is 
activated or deactivated by sensing the eye-blink of the user (or some other 
predetermined movement). In this alternative embodiment, an output signal 85 from 
the correlation section 78 is sent to the video enable section 86; and the output signal 
85 corresponds to blink data from the user's face 12a (Figure 1A). In another 
embodiment, the video cursor control section 86 activates or deactivates cursor 
control by recognizing voice commands. A microphone 87 detects the user's voice 
and a voice recognition section 89 converts the voice to certain activate or deactivate 
signals. For example, the section 89 can be set to respond to "activate" as a voice 
command that will enable cursor control; and "deactivate" as a command that 
disables cursor control. 

The functionality of the video cursor control 86 provides the user with the 
equivalent of a mouse pick-up, put-down action. As the user moves the cursor from 
left to right across the screen, the user would de-activate motion-based cursor 
control in order to allow the user to move his head back to the left. Once the user 
has recentered his head, the user would once again activate the cursor control and 
continue to move the cursor about the screen. The activation/deactivation of the 
mouse input is represented by the switch 90, such that the open position of the 
switch disables human motion control of the cursor and supplies a zero change input 
to the summation operation 92 in such conditions. 

Those skilled in the art should appreciate that control of scene view may also 
be implemented by an algorithm such as shown in Figure 3. Specifically, a similar 
algorithm can provide movement of the current scene view, in accord with the 
invention. 

With video cursor control enabled, the result of the cursor update compute 
function 84 is added to the known current cursor position at the summing operation 
92. This summation has an x component and a y component. The result of the 
summation 92 is used to update the cursor position (or scene view) on the user's 
screen via the user's operating system. Cursor position may thus be controlled by 
both user motion as well as the motion imparted by another input device such as a 
standard computer mouse. 
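
The switch 90 and summing operation 92 can be sketched as follows. The function name `update_cursor` and the way a conventional mouse delta is folded into the same sum are assumptions for illustration; the text states only that the computed update is added to the current position in x and y, and that an open switch supplies a zero change.

```python
def update_cursor(position, head_delta, mouse_delta, video_enabled):
    """Sum the current cursor position with the enabled motion inputs."""
    # Switch 90: when open (disabled), the head-motion input is a zero change.
    dx, dy = head_delta if video_enabled else (0, 0)
    # Summation 92, applied separately to the x and y components.
    x = position[0] + dx + mouse_delta[0]
    y = position[1] + dy + mouse_delta[1]
    return (x, y)


enabled_result = update_cursor((100, 100), (5, -3), (0, 0), video_enabled=True)
disabled_result = update_cursor((100, 100), (5, -3), (2, 0), video_enabled=False)
```

With the switch open, the head can be recentered without moving the cursor, while a conventional mouse still contributes to the sum.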

Figure 4 provides a detailed description of the preferred implementation of 
the algorithm described in functions 74, 76, 78 and 82 of Figure 3. Video data is 
received by the processing electronics in both a single frame memory 100 and a 
differencer 102. The output of the frame memory 100 is also provided to the 
differencer 102 such that the previous frame is subtracted from the current frame. 
This differenced frame is then processed by a two dimensional FFT 104. The complex 
result of the FFT 104 is provided to a complex multiplier 106 and a complex memory 
108. The complex memory 108 is the size of the processed image, each location 
containing both a real and imaginary component of a complex number. With each 
new FFT operation 104, the previous FFT result, contained in the complex memory 
108, is provided to the conjugate operation 110. The complex conjugate of each 
element is computed and provided to the complex multiplier 106. In this manner, 
the FFT of the previous frame difference is conjugated and multiplied against the 
FFT of the current difference image. 

The two dimensional array of complex products output by the complex 
multiplier 106 is provided to a two dimensional inverse FFT operation 112. This 
operation creates an image of the correlation function 114 between the latest pair of 
difference images. The correlation image is processed by the peak detection function 
114 in order to determine the shift required in aligning the two difference images. 
The x-y magnitude of this shift is representative of the user's motion. This x-y 
magnitude is provided to the software used to update the cursor position as 
described in Figure 3. 
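
The transform, conjugate, multiply, inverse-transform and peak-detect chain of Figure 4 can be sketched in a few lines. For self-containment this sketch substitutes a direct discrete Fourier transform for a true FFT (same result, far slower, adequate only for tiny frames) and recomputes the previous transform rather than reading it from a complex memory. The names `dft2` and `correlation_shift` and the 8x8 test frames are assumptions for illustration.

```python
import cmath


def dft2(image, inverse=False):
    """Direct 2-D discrete Fourier transform (stand-in for the 2-D FFT)."""
    rows, cols = len(image), len(image[0])
    sign = 1 if inverse else -1
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            acc = 0j
            for r in range(rows):
                for c in range(cols):
                    angle = 2 * cmath.pi * (u * r / rows + v * c / cols)
                    acc += image[r][c] * cmath.exp(sign * 1j * angle)
            out[u][v] = acc / (rows * cols) if inverse else acc
    return out


def correlation_shift(previous_diff, current_diff):
    """Return the (row, col) shift of the current difference image relative to the previous one."""
    rows, cols = len(current_diff), len(current_diff[0])
    f_cur = dft2(current_diff)
    f_prev = dft2(previous_diff)
    # Multiply the current transform by the conjugate of the previous transform.
    product = [[f_cur[u][v] * f_prev[u][v].conjugate() for v in range(cols)]
               for u in range(rows)]
    corr = dft2(product, inverse=True)  # circular cross-correlation image
    # Peak detection: the location of the largest real value is the shift.
    peak_r, peak_c = max(((r, c) for r in range(rows) for c in range(cols)),
                         key=lambda rc: corr[rc[0]][rc[1]].real)
    # Indices past the midpoint wrap around to negative shifts.
    if peak_r > rows // 2:
        peak_r -= rows
    if peak_c > cols // 2:
        peak_c -= cols
    return peak_r, peak_c


def frame_with_feature(row, col, size=8):
    frame = [[0] * size for _ in range(size)]
    frame[row][col] = 9
    return frame


down_right = correlation_shift(frame_with_feature(2, 2), frame_with_feature(3, 5))
up_left = correlation_shift(frame_with_feature(2, 2), frame_with_feature(1, 1))
```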

Figure 5 shows an algorithm process 130 of the invention which applies 
motion correlation operations over sub-frames of the video image. This allows 
motions of various body parts to convey input with specialized meaning to 
applications operating on the host computer. In addition to head motion, motion of 
the hands, arms and legs provide for greater degrees of freedom for the user to 
interact with the host application (e.g., a game). Commands of this type are useful in 
combative games where computer animated opponents fight under control of the 
user. In that instance, the hand, arm and leg motions of the user become punch, chop 
and kick commands to the computer after process 130. This command mode can also 
be used in situations where the user does not have ready access to a keyboard, to 
augment cursor control of the previously described head position correlator. 

Process 130 identifies the functions required to derive commands from 
general motions of the user's body. The scene analyzer function 132 receives 
digitized video frames from the camera (e.g., the camera 16 of Figure 1) and 
identifies sub-frames within the video for tracking various parts of the user's body. 
The frame difference function 134 and correlator function 136 provide similar 
functions as processes 74, 76 and 78 of Figure 3. The correlation analyzer 138 receives 
correlated difference frames from the correlator function 136 and sub-frame 
definitions from the scene analyzer 132. The correlation analyzer 138 applies a peak 
detection function to each sub-frame to identify the shift required to achieve best 
alignment of the two images. Correlation peaks occurring in the center of the sub- 
frame indicate no motion, while peaks occurring elsewhere indicate the direction 
and magnitude of the user's motion. The motion interpreter 140 receives motion 
vectors for each sub-frame from the correlation analyzer 138. The motion interpreter 
140 links the motion vector from each sub-frame with a particular body segment and 
passes this information onto the host interface 142. The host interface 142 provides 
for communication with the host processor (e.g., CPU 30, Figure 1). It sends data 
packets to the host to identify detected body motions, their directions and their 
amplitudes. The host interface 142 also receives instruction from the host as to which 
body segments to track, which it then passes along to the motion interpreter 140 and 
the scene analyzer 132. 
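
The peak-to-motion step performed per sub-frame can be sketched as follows, assuming each sub-frame's correlation image is arranged so that zero shift falls at its center. The names `motion_vector` and `interpret`, the body-segment labels and the 5x5 sample images are hypothetical.

```python
def motion_vector(correlation):
    """Offset of the correlation peak from the sub-frame center, as (rows, cols)."""
    rows, cols = len(correlation), len(correlation[0])
    peak_r, peak_c = max(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: correlation[rc[0]][rc[1]],
    )
    # A peak at the center means no motion; any other location gives
    # the direction and magnitude of the motion.
    return (peak_r - rows // 2, peak_c - cols // 2)


def interpret(sub_frame_correlations):
    """Link each sub-frame's motion vector to its body segment label."""
    return {segment: motion_vector(corr)
            for segment, corr in sub_frame_correlations.items()}


def correlation_with_peak(row, col, size=5):
    image = [[0.0] * size for _ in range(size)]
    image[row][col] = 1.0
    return image


motions = interpret({
    "head": correlation_with_peak(2, 2),        # centered peak: still
    "left_hand": correlation_with_peak(0, 4),   # peak up and to the right
})
```

The resulting dictionary is the kind of per-segment report that a host interface could package and send to the host.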

The scene analyzer 132 first identifies the location of the user's body in the 
image and locates the position of various parts of the user's body such as hands, 
forearms, head, and legs. The techniques and methods used to identify the user's 
body location and body part positions can be accomplished using techniques well 
known to those skilled in the art (by way of example, via matched filtering). Body 
identification can also be augmented by marking different locations on the user's 
body with unique visual symbols. Unique symbols are assigned to key body joints 
such as elbows, shoulders, hands, neck, knees, and waist and are mounted on the 
body. See for example Figure 5A. 

Figure 5A illustrates one frame 149 of data of an image of the user 150 as 
taken by a camera of the invention. The image corresponds to a full body image of 
the user 150, including arms 151, legs 152, elbows 151a, hands 153, head 154, neck 
155, ears 156, and forehead 157. These parts 151-157 are identified by processes of the 
invention (e.g., spatial location in the image, by matched filtering or other image 
recognition technique), and the image is preferably marked with unique symbols 
(e.g., "X" for center of the face, "Y" for center of the hand 153, "T" for center of the 
user's foot, "Z" for body center, and "F" for forehead 157). 

With further reference to Figure 5, process 130 locates various body parts and 
preferably marks them with symbols to fill in connecting logic (e.g., the left wrist 
and left elbow symbol identify the location of the left forearm). Once the user's body 
parts are located, sub-frames surrounding each of the body segments identified by 
the host processor are generated. A sub-frame is a generally regularly shaped region 
within the image that surrounds a particular body part. The sub-frames are sized to 
center the subject body part in the sub-frame and to provide enough room around 
the body part to accommodate typical body motions. One sub-frame 160 is shown in 
frame 149, Figure 5A, surrounding the user's foot "T". The scene analyzer 132 will 
generally not operate on each frame of video since continuously changing the sub- 
frames adds unnecessary complication to the correlation analyzer 138. Instead, the 
scene analyzer 132 runs as a background process updating the sub-frame locations 
periodically. 

Figure 4 provides a detailed description of one algorithm which can be used 
to implement processes 134-138 of Figure 5. 

The invention of one embodiment can thus track the motion of the user's 
body using symbols attached to key joints. As an example, the position of the user's 
left lower arm can be determined by locating the unique symbol for the left hand 
"Yl" and the left elbow. Unique symbols thus allow the processor to rapidly locate 
each portion of the user's body in a video frame. To determine the motion of a 
particular part of the user's body, the algorithm (e.g., Figure 4) compares the 
position of the relevant body parts in consecutive frames and determines how they 
moved (for example, using geometry). Once motion is determined, it is then passed 
to the host CPU where the motion is acted on as appropriate for the particular 
application. 
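
As one hedged example of the geometry mentioned above, the position and orientation of a limb segment can be derived from the image coordinates of the two symbols at its ends, and compared across consecutive frames. The function names `segment_pose` and `segment_motion` and the sample coordinates are assumptions for illustration only.

```python
import math


def segment_pose(joint_a, joint_b):
    """Midpoint and orientation (degrees) of the segment between two symbol positions."""
    (ax, ay), (bx, by) = joint_a, joint_b
    midpoint = ((ax + bx) / 2, (ay + by) / 2)
    angle = math.degrees(math.atan2(by - ay, bx - ax))
    return midpoint, angle


def segment_motion(prev_joints, cur_joints):
    """Translation of the midpoint and rotation of the segment between two frames."""
    prev_mid, prev_angle = segment_pose(*prev_joints)
    cur_mid, cur_angle = segment_pose(*cur_joints)
    translation = (cur_mid[0] - prev_mid[0], cur_mid[1] - prev_mid[1])
    # Wrap the angle change into the range [-180, 180).
    rotation = (cur_angle - prev_angle + 180) % 360 - 180
    return translation, rotation


# (elbow, hand) symbol positions in two consecutive frames, in image coordinates.
previous = ((0, 0), (10, 0))
current = ((0, 0), (0, 10))
translation, rotation = segment_motion(previous, current)
```

Here the elbow symbol stays fixed while the hand symbol swings through a quarter turn, so the forearm midpoint translates and the segment rotates by 90 degrees.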

Figure 6 illustrates a two camera system 200, constructed according to the 
invention. The cameras 202a, 202b are arranged to view separate parts of the user: 
camera 202a images the user's face 204; and camera 202b images the user's hand 206. 
The cameras 202 conveniently rest on top of the computer display 208 coupled to the 
host computer 210 by cabling 216. The cameras 202 couple to the signal processing 
card 212 residing within the computer 210 by cabling 213. As discussed herein, 
motion of the user's head 204 and/or hand 206 are detected by the signal processing 
card 212, and difference information is communicated to the computer's CPU 210a 
via the computer bus 214. This difference information corresponds to composite 
movement of the head 204 and hand 206; and is used by the CPU 210a to command 
movement of display items on the display 208 (for example, the display items can 
include the cursor or scene view as shown on the display 208 to the user). 
Information shown on the display 208 is communicated from the computer 210 to 
the display 208 along standard cable 216. 

Figures 7 and 7A illustrate how motion of a user's head is for example 
translated to motion of the cursor and/or scene view, in accord with the invention. 
Figure 7 shows a representative image 220 of a user captured within a frame of data 
by a camera of the invention. Figure 7 also shows a representative image 222 (in 
dotted outline, for clarity of illustration) of the user in a subsequent frame of data, 
indicating that the user moved "M" inches. Figure 7A illustrates corresponding 
scene views on a computer display 224 that is coupled to processing algorithms of 
the invention (i.e., within a system that includes a camera that captures the images 
220, 222 of Figure 7). The display 224 illustratively shows a scene view that includes 
a road 224a that extends off into the distance, and a house 224b adjacent to the road 
224a. A computer cursor 224c is also illustratively shown on the display 224 as such 
a cursor is common even within computer games, providing a place for the user to 
select items (such as the road or house 224a, 224b) within the display 224. The 
display 224 also shows, with dotted outlines 226, the scene view of road and house 
which are shown on the display 224 after motion by the user from 220 to 222, Figure 
7 (the cursor 224c is for example repositioned to position 224c'). The repositioning of 
the scene view from 224a, 224b to 226 occurs immediately (typically much less than 
one second) after the movement of the user of Figure 7 from 220 to 222. The scene 
view is repositioned by x-pixels on the display 224, so that M/x corresponds to the 
magnification between user movement and scene view repositioning. This 
magnification can be set by parameters within the system; and can also be set by the 
user, if desired, at the computer keyboard. Furthermore, the rate at which the scene 
view moves the distance of x-pixels preferably occurs at the same rate as the rate of 
travel along distance M. Alternatively, the magnification can be dependent on the 
rate of motion such that a larger displacement of x-pixels will occur for a given 
motion M if the rate of change of M is larger. 

Figure 8 illustrates a further motion that can be captured by a camera of the 
invention and processed to reposition a scene view, as shown in Figures 8A and 8B. 
More particularly, Figure 8 illustrates a camera 250 connected to a processing section 
252 which converts user motion 254 to corresponding repositioning of the computer 
scene view. As above, the user 256 is captured by the camera's field of view 258 and 
frames of data are captured by the processing section 252. In Figure 8, motion 254 
corresponds to a twisting of the user's head 256; and processing section 252 detects 
this twisting and provides repositioning information to the host computer (not 
shown). Figure 8A shows a representative scene view 260 on a display 262 coupled 
to the host computer. Figure 8B illustrates repositioning of the scene view 260' after 
the processing section 252 detects motion 254 and updates the host computer with 
difference information (e.g., that information which the host computer uses to rotate 
or translate the scene view). 

Figure 8A also illustrates the intent of the rotating scene view feature. In 
Figure 8A, a person 260a is shown in the scene view 260, except that the person 260a 
is almost completely obscured by the edge 262a of the display 262. By twisting the 
head 256 in motion 254, the scene view 260 is rotated in the corresponding direction, 
as shown by scene view 260' in Figure 8B, so that the person 260a' is completely 
visible within the scene view 260'. 

Figure 8C illustrates further detail of the processing section 252. Camera data 
such as frames of images of a user are input to the section 252 at data port 266. The 
data are conditioned in the image conditioning section 268 (for example, to reduce 
correlated noise or other image artifacts). Thereafter, the camera data is compared 
and correlated in the image correlation section 270, which compares the present 
frame image with a series of stored images from the image memory 272. In the 
preferred embodiment, the present data image frame 249 is cross-correlated with 
each of the images within the image memory 272 to find a match. These images 
correspond to a series of images of the user in known positions, as illustrated in 
Figure 8D. 

In Figure 8D, various images are stored representing various known positions 
of the relevant part, here the user's head 256. In the position of Figure 8, for example, 
a straight-on face shot, the 0° stored memory image would provide the greatest 
cross-correlation value, indicating a matched image position. Accordingly, the scene view 
would adjust to a zero position. If, however, the image correlated to a -90° position, 
the scene would rotate to such a position. Other movements cause additional scene 
view motions, including tilt and tip of the head, as shown in the two images "0°, 
Down 45°" and "0°, Up 45°". These images cause the scene view to move 
upwards or to tilt up or down, when the process section 252 correlates the current 
frame to these images. As indicated, these images have no left or right component, 
though other images (not shown) can certainly include left, right and tip motion 
simultaneously. 
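
Matching the present frame against a bank of stored images of known positions can be sketched as follows. This sketch assumes a zero-lag normalized correlation as the match score; the names `normalized_correlation` and `match_pose`, the position labels and the tiny 2x3 images are hypothetical stand-ins for the stored images of Figure 8D.

```python
import math


def normalized_correlation(a, b):
    """Zero-lag normalized cross-correlation of two equal-sized images."""
    flat_a = [p for row in a for p in row]
    flat_b = [p for row in b for p in row]
    mean_a = sum(flat_a) / len(flat_a)
    mean_b = sum(flat_b) / len(flat_b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(flat_a, flat_b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in flat_a)
                    * sum((y - mean_b) ** 2 for y in flat_b))
    return num / den if den else 0.0


def match_pose(frame, stored_images):
    """Return the label of the stored image with the greatest correlation value."""
    return max(stored_images,
               key=lambda label: normalized_correlation(frame, stored_images[label]))


stored = {
    "0 deg": [[0, 9, 0], [0, 9, 0]],
    "-90 deg": [[9, 0, 0], [9, 0, 0]],
    "+90 deg": [[0, 0, 9], [0, 0, 9]],
}
noisy_straight_on = [[1, 8, 0], [0, 9, 1]]
pose = match_pose(noisy_straight_on, stored)
```

The matched label would then select the scene view position, for example the zero position for a straight-on face.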

Figure 9 shows a system 300 constructed according to the invention and 
including a camera section 302 including an IR imager 304 and a camera 306, both of 
which view and capture frames of data from a user 308. The IR imager 304 can 
include, for example, a microbolometer array (i.e., "uncooled" detectors known in 
the art) which produces a frame of data corresponding to the infrared energy 
emitted from the user, such as illustrated in Figure 9A. Figure 9A shows a 
representative frame of IR image data 310, with zones 312 of relatively hot image 
data emitting from regions of forehead, nose and mouth of the user 308. 

The cameras 304, 306 send image data back to the signal processing section 
314. Data from the camera 306 is processed, if desired, as above, to determine 
difference information signal 322 used by a connected computer to reposition the 
cursor and/or scene view. Data from camera 304, on the other hand, is used to 
evaluate how much (or how hot) zones 312 appear on the user during play of the 
computer. The signal processing section 314 assesses the zones 312 for temperature 
and/or size over the course of a computer game and generates a "game speed 
control" signal 320 which is communicated to the user's computer (i.e., that 
computer used in conjunction with the system 300 of Figure 9). The user's computer 
processes the signal 320 to increase or decrease the speed of a computer game in 
process on the computer. 

Those skilled in the art should appreciate that the IR camera 304 can be used 
without the features of the invention which assess user movement. Rather, this 
aspect should be considered stand-alone, if desired, to provide active feedback into 
gaming speed based upon user temperature and/or stress. Note that the camera 304 
can also be used to detect heartbeat since the zones 312 generally pulse at the user's 
heartbeat, so that heartbeat rate can also be considered as a parameter used in the 
generation of the game speed control signal 320. Alternatively, a pulse rate can be 
determined by known pulse rate systems that are physically connected to the user 
308. 

An IR lamp 324 can be used in system 300 to illuminate the user 308, with IR 
radiation 324a, such that sufficient IR illumination reflects off of the user 308 
whereby motion control of the cursor and/or scene view can be made without the 
additional camera 306. The lamp 324 can be, and preferably is, made integrally with 
the section 302 to facilitate production packaging. 

Figure 9B shows process methodology of the invention to process thermal 
user images in accord with the preferred embodiment of the invention. Specifically, a 
system such as system 300 first acquires a thermal image map in process block 326. 
This image is compared to a reference image ("REF") in process block 327. REF can 
either be a temperature of the user (i.e., a temperature of one hot spot of a 
non-stressed user, or the temperature of one hot spot of the user at an initial, pre-game 
condition) or an amount of the area 312, Figure 9A, of the user in a non-stressed 
condition or initial pre-game condition. By way of example, REF can be an image 
such as the frame 310 of Figure 9A. When the temperature or area of the region 312 
increases, the system 300 detects this change and determines that the image map 
exceeded the REF condition, as illustrated in process block 328. Should the map 
exceed the REF condition, the system 300 communicates this to the host processor 
which in turn adjusts the gaming speed, as desired. If the map does not exceed the 
REF condition, then the next IR image frame is acquired at block 326. 

System 300 and the process steps of Figure 9B are thus suitable to adjust 
gaming speed in real time, depending upon user stress level. In the preferred 
embodiment, the gaming speed is increased automatically such that the image map 
exceeds the REF signal for greater than about 50% of the time, so that all users, 
regardless of their ability, are pushed in the particular game. 
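
The compare-to-REF loop of Figure 9B and the roughly 50% target described above can be sketched as follows. The function names, the temperature `threshold` used to define a hot zone, the speed `step` and the sample maps are all assumptions for illustration; the disclosure does not give numeric values other than the approximate 50% figure.

```python
def hot_area(thermal_map, threshold):
    """Count pixels hotter than `threshold` (a stand-in for the size of zones 312)."""
    return sum(1 for row in thermal_map for temp in row if temp > threshold)


def exceeds_ref(thermal_map, ref_area, threshold):
    """Process blocks 327/328: does this image map exceed the REF condition?"""
    return hot_area(thermal_map, threshold) > ref_area


def adjust_speed(speed, exceed_history, target=0.5, step=0.1):
    """Increase game speed until the map exceeds REF more than `target` of the time."""
    fraction = sum(exceed_history) / len(exceed_history)
    if fraction <= target:
        return speed + step
    return speed


calm = [[30, 31], [30, 30]]
stressed = [[30, 36], [37, 36]]
ref_area = hot_area(calm, threshold=35)            # pre-game reference: no hot pixels
history = [exceeds_ref(m, ref_area, 35) for m in (calm, calm, calm, stressed)]
faster = adjust_speed(1.0, history)                # user rarely above REF: speed up
held = adjust_speed(1.0, [True, True, True, False])
```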

Those skilled in the art should appreciate that multi-camera embodiments of 
the invention can be, and preferably are, incorporated into a common housing 338, such 
as shown in Figure 10. Further, as illustrated in Figure 10, cameras can also be made 
from detector arrays 340, processing electronics 342, and optics 344. Each camera 
340, 342, 344 is constructed to process the correct electromagnetic spectrum, e.g., IR 
(using, for example, germanium lenses 344 and microbolometer detectors 340). Each 
camera has its own field of view 350a, 350b and focal distance 352a, 352b to image at 
least a part of the user. These fields of view 350 can overlap, to view the same area 
such as the user's face, or they can view separate locations, such as the user's head 
and hand. 

Cameras of the invention can also include a DSP section 356 such as described 
above to process user motion data. The DSP section 356 processes user motion data 
and sends difference information to the user's host computer. The host computer 
thereafter repositions the cursor and/or scene view based upon the difference 
information so that the user observes corresponding motion on the computer 
display, as described above. Accordingly, the DSP section need not reside within the 
computer so long as difference information is isolated and communicated to the host 
computer CPU. 

Figure 11 illustrates frame capture by one camera of the invention to isolate 
zones of imaging according to expected motion patterns. In Figure 11, one frame 370 
of data for example covers the user's eyes 371, corresponding to one image zone; and 
another frame 372 of data can cover the user's head 373, corresponding to another 
image zone. As mentioned previously, preferably the frames 370, 372 are 64x64 
pixels each, or 256x256 (or a higher power of two) to provide FFT capability on the 
image within the frame. A single camera can however provide both frames 370 and 
372, in accord with the invention. Specifically, a dense CCD detector array (e.g., 
480x740 pixels, 1000x1000 pixels, or higher) is used within the camera such that the 
whole array captures an image frame 376 of data, at least covering the available 
image format of the computer display 378. A matched filtering (or other image 
locate process) is processed on the frame 376 to locate the center 371a of the user's 
eyes (in the matched filtering process, an image data set of the user's eyes is stored in 
memory and correlated to the frame 376 such that a peak correlation is found at 
position 371a). Thereafter, a 64x64 array of data is centered about the eyes 371 to set 
the frame 370. To acquire the frame 372, every other pixel is discarded so that, again, 
a 64x64 array is set for the frame 372 (alternatively, each adjacent pair of pixels is 
added and averaged to provide a single number, again reducing the total number of 
pixels to 64x64). Note that this process is reasonable since the width of the eyes is at 
least 1/2 the width of the user's face. Nevertheless, further compression can be 
obtained by utilizing every third pixel (or averaging three adjacent pixels) to obtain 
a larger image area in the frame 372. Note that the compression in the width and 
length dimensions need not be the same. 
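
The two reduction options described above (discarding pixels, or adding and averaging adjacent pixels) can be sketched as follows, with independent row and column factors since the compression need not be the same in both dimensions. The function names `crop`, `decimate` and `average_blocks` and the 4x4 sample frame are assumptions for illustration.

```python
def crop(frame, center, size):
    """Cut a size x size window centered at (row, col), e.g. about the eyes."""
    r0 = center[0] - size // 2
    c0 = center[1] - size // 2
    return [row[c0:c0 + size] for row in frame[r0:r0 + size]]


def decimate(frame, row_step=2, col_step=2):
    """Keep every `row_step`-th row and every `col_step`-th column."""
    return [row[::col_step] for row in frame[::row_step]]


def average_blocks(frame, row_step=2, col_step=2):
    """Replace each row_step x col_step block of pixels with its average."""
    out = []
    for r in range(0, len(frame) - row_step + 1, row_step):
        out_row = []
        for c in range(0, len(frame[0]) - col_step + 1, col_step):
            block = [frame[r + i][c + j]
                     for i in range(row_step) for j in range(col_step)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


full = [[r * 4 + c for c in range(4)] for r in range(4)]   # 4x4 stand-in for frame 376
window = crop(full, center=(2, 2), size=2)
dropped = decimate(full)
averaged = average_blocks(full)
```

Applied to a 128x128 region, either reduction with a factor of two yields the 64x64 array used for the transform; averaging trades a little extra arithmetic for lower noise.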

Framing of the information in Figure 11 can occur in several ways. Most 
cameras image at 30Hz so that image motion is smooth to the human eye. In one 
embodiment, one frame 370 is taken in between each frame 372, to minimize data 
throughput and processing; and yet to maintain dual processing of the two zones 
imaged in Figure 11. Alternatively, both frames 370, 372 are processed concurrently 
since frame 376 is typically the 30Hz frame. 

Figure 11 also illustrates how framing can occur around the user's eyes 371 to 
acquire "blink" information to reset cursor control. A blink of the user's 
eyes detected in frame 370 (or other frame) can be used to (a) disable or enable control of 
cursor or scene movement based upon user control, or (b) simulate pick-up and 
replacement of the computer mouse (i.e., reinitializing movement in a particular 
direction). For example, by detecting a blink of the eyes 371, a system of the 
invention can disable human motion following control such as described herein. 
Another blink can be used to enable human motion following control. Blinking can 
also be used to continue motion in a particular direction. For example, movement of 
the cursor can be made to follow movement of the user's head, as described above. 
However, after a while, the person has to move to an uncomfortable position to keep 
moving the cursor (or scene). A blink can thus also serve to reposition the head back 
to a normal starting position so that further movement in the desired direction can 
be made.
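
One way to picture the blink behavior is as a simple on/off gate on the motion signal. The following C sketch is an assumption made for illustration only; the type BlinkGate and both function names are hypothetical and do not appear in Appendix A. Each detected blink flips the gate, and motion is passed to the cursor only while the gate is enabled, which lets the user return the head to a comfortable position without moving the cursor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical controller state: while 'enabled' is zero, head motion is
 * ignored so the user can recenter without moving the cursor. */
typedef struct {
    int enabled;
} BlinkGate;

/* Flip the gate each time a blink is detected (blink_detected != 0). */
void blink_gate_update(BlinkGate *g, int blink_detected)
{
    assert(g != NULL);
    if (blink_detected)
        g->enabled = !g->enabled;
}

/* Pass a motion delta through only while control is enabled. */
int blink_gate_apply(const BlinkGate *g, int delta)
{
    assert(g != NULL);
    return g->enabled ? delta : 0;
}
```

This mirrors lifting and replacing a conventional mouse: blink once, move the head back, blink again, and continue in the same direction.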

Figure 12 illustrates a similar capture of a user's eyes 400, in accord with the 
invention. A frame 402 can thus be acquired by a camera of the invention. Figure 
12A illustrates further detail of one representative frame 402, illustrating that the 
user's pupils 404 are also captured. Figures 3 and 4 describe certain algorithms of the 
invention that are also applicable to motion of the user's pupils 404, as illustrated by 
left and right motion 406 and up and down motion 408. Accordingly, by zooming in 
on the user's eyes, another movement zone is created that causes repositioning of the 
cursor or the scene view based upon the movements 406, 408, much like the head 
movement described and illustrated in Figures 1-4. 

Note that the teachings of Figures 1-4 and 12-12A can be combined within a 
two zone movement system so that, for example, both head motion and pupil 
motion can be evaluated for image motion. The cursor and/or scene view can be
repositioned, therefore, based upon movements from both zones. By way of
example, repositioning of items within the display (e.g., the cursor and/or scene 
view) can be made when the head moves but not if the head and eyes move, which 
might indicate that the user is simply looking elsewhere in the room due to a 
distraction. However, if the user moves his head, but not his eyes, he is focussed on 
the game and intends rotation of the scene view, in another example. Other 
combinations are also possible. 
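
The head-but-not-eyes example above might be expressed as a small decision function. The C sketch below is illustrative only: the name should_reposition and the single pixel threshold are assumptions, and the displacement inputs are taken to be the per-zone motion estimates produced by the processing described for Figures 1-4 and 12-12A.

```c
#include <assert.h>

/* Returns nonzero when the display should be repositioned: the head zone
 * moved by at least 'threshold' pixels while the pupil zone did not.
 * Squared magnitudes are compared so no square root is needed. */
int should_reposition(double head_dx, double head_dy,
                      double eye_dx, double eye_dy, double threshold)
{
    assert(threshold >= 0.0);
    double t2 = threshold * threshold;
    int head_moved = (head_dx * head_dx + head_dy * head_dy) >= t2;
    int eyes_moved = (eye_dx * eye_dx + eye_dy * eye_dy) >= t2;
    return head_moved && !eyes_moved;
}
```

Other combinations (for example, acting only when both zones move) would change only the final expression.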

Cameras of the invention can also include zoom optics which (a) reduce or 
enlarge the image frame captured by a particular camera, or which (b) provide 
autofocus capability. Figure 13 shows one system 430 constructed according to the 
invention. A camera 432 includes camera electronics 432a and a zoom attachment 
432b. Data from the camera 432 is relayed to image and interpretation feedback 
electronics 434 for evaluation. For purposes of image magnification, the feedback 
electronics serve to evaluate a given image size relative to desired image goals. For 
example, imaging the user's eyes with high fidelity might require a high density of
pixels at the user's eyes (e.g., at the zone 370, Figure 11). Accordingly, the system 430 
can isolate the user's eyes, such as described herein, and command the camera 
(through command lines 436) to increase or decrease magnification on the user's 
eyes so as to achieve desired resolution. The feedback electronics can also command 
motion of the camera to change its boresight alignment (i.e., to change where the 
camera image is centered) by commanding movement of the camera when resting on 
one or more linear drives 438, as known in the art. 

Once aligned on the desired user location, e.g., on the eyes with desired 
accuracy, the system 430 continues processing data such as described herein to create 
human interface control of items displayed on the user's computer, e.g., cursor 
and/or scene view. Accordingly, processing section 440 operates to detect user
motion and to communicate difference information to the user's computer, as 
described above.

With regard to autofocus, the system 430 of Figure 13 can also be used to
process user motion based upon motion towards and away from the camera. Figure 
14 illustrates such a system, including a camera 450 with autofocus capability to find 
the best focus 452 relative to a user 454 within the field of view 456. For example, 
when the user 454 moves to position 460 (the user being shown in outline form 
454a), the new best focus has changed to 452a. The camera 450 provides a signal 450a 
to the image interpretation and feedback electronics 434, Figure 13, which indicates 
where the user is along the "z" axis from the camera 450 to the user 454. This signal 
450a is thus used much like the other motion signals described herein, to move the 
cursor and/or scene view in response to such movements. Figure 14A illustrates a
representative scene view 462 when, for example, the user is at best focus 452. The 
scene view 462 includes a house image 464 with a door 465. When the user moves to 
position 460, the house and door 464', 465' of the scene view 462' enlarge, since the 
user moved closer to the camera 450. Such a motion might reveal, for example, 
additional objects within the house, such as illustrated by object 466, Figure 14B.
Accordingly, the autofocus feature of the invention provides yet another degree of 
freedom in motion control, in accord with the invention. 

Image data, manipulation, and human interface control can be improved, 
over time, by using neural net algorithms. As shown in Figure 13, a neural net 
update section 435 can for example couple to the feedback electronics 434 so as to 
assimilate movement information and to improve data transmitted to the host 
computer, over time. The use of neural nets is known in the art.

Figure 15 illustrates a frame of data 490 used in accord with the invention to 
implement a simplified left, right, up, down movement algorithm to control cursor 
movement and/or scene view movement. Frame 490 is captured by a camera of the
invention; and preferably the camera incorporates autofocus, as described above, to 
provide a crisp image of the user 492 regardless of her position within the camera's 
field of view. As shown, image frame 490 provides very sharp edges to the user's 
face, including a left edge 494a, right edge 494b, and chin 494c. These edges need 
only approximate vertical or horizontal position. Movement of the user results in
movement of the edges 494, such as shown in Figure 15A. Figure 15A shows that
once such edges are acquired, they conveniently permit subsequent movement
analysis and control of scene view and/or cursor position. Specifically, Figure 15A
shows movement of the user's "edges" from 494a-c to 494a-c', indicating that the
user moved left (as viewed from the camera's position) and that her chin raised
slightly, indicating an upward tilt of the head. This information is assessed by
the process sections as discussed above and relayed to the host computer as
difference information to augment or provide cursor and/or scene movement in
response to the user's movement.

Note that such edge movements roughly correspond to movement along 
rows and columns of the detector array. From detected movement from one row to
another (or one column to another), the actual motion of the user can readily be
calculated using the user's best focus position and the focal length of
the camera's lens. This information may then be used to set the magnification of
movement of items in the computer display (e.g., cursor and/or scene view).
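
The calculation mentioned above can be approximated with a simple pinhole-camera relation, in which a shift on the detector scales by the ratio of the best focus distance to the focal length. The C sketch below is an assumption offered for illustration: the function name is hypothetical, and the detector pixel pitch is an extra parameter that the text does not state but that is needed to convert a count of rows or columns into a length.

```c
#include <assert.h>

/* Estimate how far the user moved (in mm) from a shift measured in detector
 * rows or columns. Pinhole approximation, valid when the focus distance is
 * much larger than the focal length:
 *   motion = shift_pixels * pixel_pitch * focus_distance / focal_length */
double user_motion_mm(double shift_pixels, double pixel_pitch_mm,
                      double focus_distance_mm, double focal_length_mm)
{
    assert(focal_length_mm > 0.0);
    return shift_pixels * pixel_pitch_mm * focus_distance_mm / focal_length_mm;
}
```

For instance, a shift of 4 pixels on a detector with 0.01 mm pixels, a 6 mm lens, and a user at 600 mm corresponds to about 4 mm of user motion; the display gain can then be chosen relative to that physical distance.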

Figure 16 illustrates an image of one object 500 used in accord with the 
invention to provide image manipulation in response to motion of the object. The 
object 500 is held by the user 501 to manipulate motion of his cursor 502 and/or
scene view 504 on his computer display 506. The object 500 is used because it 
exhibits an optical shape that is easily recognized through image correlation (such as 
matched filtering). In accord with the invention, a camera 510 is used to image the 
object 500; and frames of data are sent to the frame processor 512. The processor 512 
determines image position - relative to a starting position - and thereafter 
communicates difference information to the user's computer 505 along data line 514. 
The difference information is used by the computer's CPU and operating system to 
reposition items on the display 506 in response to motion of the object 500. Almost
any motion, including rotation, tilting and translation, can be accomplished with the
object 500 relative to a start position. This start position can be triggered by the user
501 at the start of a game by commanding that the camera 510 take a reference frame
("REF") that is stored in memory 513. The user 501 commands that REF imagery be 
taken and stored through the keyboard 505a, connected to the computer 505, which 
in turn commands the processor 512 and camera 510 to take the reference frame REF. 

Detection of motion of the object 500 is thus made possible with enhanced accuracy by
comparing subsequent frames of the object 500 with REF. When rotation,
tilt or translation is detected (for example, by using the techniques of Figures 2-4,
8-8D), the items (502, 504) on the display 506 are repositioned to follow that
movement.

The techniques of the invention permit control of the scene view and/or
cursor on a computer screen by motion of one or more parts of the user's body.
Accordingly, as shown in Figure 17, complete motion of the user 598 can be
replicated, in the invention, by correlated motion of an action figure 599 within a
game. In Figure 17, user 598 is imaged by a camera 602 of the invention; and frames
from the camera 602 are processed by the process section 604, such as described
herein. The user 598 is captured and processed, in digital imagery, and annotated
with appropriate user segments, e.g., segments 1-6 indicating the user's hands, feet,
head and main body. Motion of the segments 1-6 is communicated to the host
computer 606 from the process section 604. The computer's operating system then
updates the associated display 608 so that the action figure 599 (corresponding to an
action figure within a computer game) moves like user 598. Accordingly, the user 598
moves the action figure 599 by performing stunts (e.g.,
striking and kicking) that he would like the action figure 599 to perform, such as to
knock out an opponent within the display 608.

Figure 18 illustrates a two camera system 700 used to determine translation
and rotation. The forward viewing camera 702 observes the user's face 703 and
determines the right-left (Δx1) and up-down (Δy1) translation of the user's face 703.
The top viewing camera 704 observes the top of the user's face or head 705 and
determines the right-left (Δx2) and up-down motion (Δy2) of the user's face or head.
The two cameras 702, 704 are each processed through motion sensing algorithms
706 using the teachings above, and results are shown on the computer display 710.
For purposes of illustration, the display 710 shows an image of the user; while the
image can be, for example, an action figure or other computer object, as desired. As
indicated in Figure 18, Δy2 can be directly applied to motion control of the user's
forward and reverse motion (note, these motions are illustrated as within a
computer display 710 as processed by algorithms 706). Δx2 can be directly applied to
the user's left-right sideways or strafe motion; Δy1 can be directly applied to control
the user's up-down viewpoint, each as illustrated on display 710a. The result of the
difference between Δx2 and Δx1 can be applied to control the user's left-right turn or
viewpoint.
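
The mapping from the two cameras' measurements to game motion can be summarized in a few lines of C. The sketch below is illustrative only; the struct and function names are hypothetical, and the inputs are assumed to be the per-camera image shifts (Δx1, Δy1) and (Δx2, Δy2) already produced by the motion sensing algorithms 706.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double dx, dy; } Shift;   /* image shift seen by one camera */

typedef struct {
    double forward;   /* forward/reverse motion  */
    double strafe;    /* left-right sideways     */
    double pitch;     /* up-down viewpoint       */
    double turn;      /* left-right turn         */
} GameMotion;

/* front = (dx1, dy1) from the forward viewing camera,
 * top   = (dx2, dy2) from the top viewing camera. */
void map_two_camera(const Shift *front, const Shift *top, GameMotion *out)
{
    assert(front != NULL && top != NULL && out != NULL);
    out->forward = top->dy;               /* dy2 */
    out->strafe  = top->dx;               /* dx2 */
    out->pitch   = front->dy;             /* dy1 */
    out->turn    = top->dx - front->dx;   /* dx2 - dx1 */
}
```

The turn term is a difference because a pure sideways translation shifts the image in both cameras by a similar amount, whereas a rotation of the head shifts the two views differently.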

The techniques of Figure 18 can be further extended to front, side and top 
view cameras for complete motion detection. The top camera determines the user's 
left-right, front-back motion while the front facing camera determines the user's 
rotational up-down, left-right motion. 

Figure 19 describes an algorithm to detect user eye blink. The video imagery 
is stored into a multiple frame buffer 800. The algorithm selects the current frame 
and a frame from the frame buffer and differences these frames using the adder 802. 
The difference frame consists of the pixel by pixel difference of the delayed frame 
and the current frame. The difference frame includes motion information used by
the algorithms taught above. It also contains information on the user's eye blink.
The frames differenced by the adder 802 are separated temporally enough to ensure
that one frame contains an image of the user's face with the eyes open and the other
contains an image of the user's face with the eyes closed. The difference image contains two
strong features, one for each eye. These features are spatially separated by the
distance between the user's eyes. The blink detect function 808 inspects the image
for this pair of strong features, which are aligned horizontally and spaced within an
expected distance based on the variation from one human face to another and the
variation in seating distance expected from user to user. The recognition of the blink
features may be accomplished using a matched filter or by recognition of expected
frequency peaks in the frequency domain at the expected spatial frequency for
human eye separation. The blink detect function 808 reports the occurrence of a
blink to a controlling function to either disable the cursor motion or take some other
action.
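
A much simplified version of the blink test can be sketched in C. Note that this sketch substitutes plain peak picking for the matched filter or frequency-domain recognition named above, purely to keep the example short; the function names, the thresholds, and the masking window are all assumptions and are not taken from Appendix A. It differences two frames pixel by pixel, finds the two strongest difference features, and accepts them as a blink only if they lie on nearly the same row and are separated horizontally by a plausible eye spacing.

```c
#include <assert.h>
#include <stdlib.h>

/* Largest absolute pixel difference between frames a and b. When radius > 0,
 * pixels within 'radius' rows and columns of (skip_r, skip_c) are ignored.
 * Returns the peak value (or -1 if every pixel was ignored). */
static int strongest(const unsigned char *a, const unsigned char *b,
                     int w, int h, int skip_r, int skip_c, int radius,
                     int *row, int *col)
{
    int best = -1;
    for (int r = 0; r < h; r++) {
        for (int c = 0; c < w; c++) {
            if (radius > 0 && abs(r - skip_r) <= radius &&
                abs(c - skip_c) <= radius)
                continue;
            int d = abs((int)a[r * w + c] - (int)b[r * w + c]);
            if (d > best) { best = d; *row = r; *col = c; }
        }
    }
    return best;
}

/* Returns nonzero if the difference of the two frames shows two strong
 * features, horizontally aligned within row_tol rows and separated by
 * min_sep..max_sep columns. */
int detect_blink(const unsigned char *cur, const unsigned char *delayed,
                 int w, int h, int strength, int row_tol,
                 int min_sep, int max_sep)
{
    assert(w > 0 && h > 0 && strength > 0);
    assert(min_sep > 1 && min_sep <= max_sep);
    int r1 = 0, c1 = 0, r2 = 0, c2 = 0;
    int p1 = strongest(cur, delayed, w, h, 0, 0, 0, &r1, &c1);
    /* Mask a window around the first feature so that the second search
     * finds the other eye rather than a neighbor of the first. */
    int p2 = strongest(cur, delayed, w, h, r1, c1, min_sep - 1, &r2, &c2);
    if (p1 < strength || p2 < strength)
        return 0;
    int sep = abs(c1 - c2);
    return abs(r1 - r2) <= row_tol && sep >= min_sep && sep <= max_sep;
}
```

In practice min_sep and max_sep would be chosen from the expected range of eye spacing in pixels, given the variation between faces and seating distances.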

Appendix A, pages A1-A27, provides non-limiting source code to illustrate
certain features and operations of the invention. 

The invention thus attains the objects set forth above, among those apparent 
from the preceding description. Since certain changes may be made in the above 
methods and systems without departing from the scope of the invention, it is 
intended that all matter contained in the above description or shown in the 
accompanying drawing be interpreted as illustrative and not in a limiting sense. It is 
also to be understood that the following claims are to cover all generic and specific 
features of the invention described herein, and all statements of the scope of the 
invention which, as a matter of language, might be said to fall there between. 

Having described the invention, what is claimed is:

1. A human motion following controller for augmenting motion of items shown
on a computer display, the display being coupled to a computer of the type which 
controls positioning of the items through operating system controls, comprising: 

a camera for capturing frames of data corresponding to an image of a first part of a 
user at the computer display; 

signal processing means coupled to the camera for (a) detecting differences between 
successive frames of data corresponding to motion of the first part, and (b) 
communicating differences information to the computer to reposition display of the 
items through operating system controls, the items being repositioned on the display
by an amount corresponding to the motion of the first part. 

2. A controller of claim 1, wherein the items comprise a computer cursor. 

3. A controller of claim 1, wherein the items comprise a scene view. 

4. A controller of claim 1, further comprising a PC card for installation within 
the computer and for communication on a computer bus, the signal processing
means being substantially resident with the PC card for communicating differences 
information to the bus. 

5. A controller of claim 1, wherein the camera comprises means for capturing 
augmented frames of data corresponding to a second part of a user at the computer 
display, the signal processing means further comprising means for detecting 
differences between successive augmented frames of data corresponding to motion 
of the second part and for communicating augmented differences information to the 
computer to reposition display of the items through operating system controls, the 
items being repositioned on the display by an amount corresponding to motion of 
the first and second parts. 



NON LIMITING SOURCE CODE FOR U.S. PROVISIONAL APPLICATION ENTITLED
"HUMAN MOTION FOLLOWING COMPUTER MOUSE AND GAME CONTROLLER" by Robert
D. Frey et al., Filed September 11, 1998



// cursor.cpp Demonstration program for video mouse.
#include "glos.h"

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <math.h>
#include <gl\glaux.h>
#include "fftn.h"
#include "mv1.h"

#define NORMAL 0
#define FRAMESIZE 64
#define ZOOM 2.0
#define DISPLAY 1

static void CALLBACK Key_q(void);
static void CALLBACK CursorControl(void);
void MTPROCALL caminterrupt(short);
void InitializeCamera(void);
void StartFrameGrab(void);
void PositionCursor(double detectx, double detecty);

int numPixels;
unsigned char *camBuffer;
double *reData1, *imData1;
double *reData2, *imData2;
double *reCorrelation, *imCorrelation;
unsigned char *camFrame;

volatile long waitforinterrupt;

int main(int argc, char **argv)
{
    int windW = (int)(ZOOM * FRAMESIZE);
    int windH = (int)(ZOOM * FRAMESIZE);

    auxInitPosition(0, 0, windW, windH);
    auxInitDisplayMode(AUX_RGB | AUX_DOUBLE);

    if (auxInitWindow("Headmouse View") == GL_FALSE) {
        auxQuit();
    }

    glClearColor(0.0, 0.0, 0.0, 0.0);
    glPixelZoom(ZOOM, ZOOM);

    InitializeCamera();
    auxKeyFunc(AUX_q, Key_q);

    // auxIdleFunc(NULL);
    auxMainLoop(CursorControl);

    return 0;
}



static void CALLBACK CursorControl(void)
{
    int k;
    int even = 1;
    unsigned char temp = 0;
    int dims[2] = {FRAMESIZE, FRAMESIZE};
    double multiplier;

    while (1) {
        // Wait until the frame-grabber interrupt clears the flag, then re-arm it.
        while (waitforinterrupt != 0);
        waitforinterrupt = 1;

        if (even) {
            even = 0;
            // Difference the new frame (read back to front) against the stored frame.
            for (k = 0; k < numPixels; k++) {
                temp = camBuffer[numPixels - k - 1];
                reData1[k] = (double)(camFrame[k] - temp);
                imData1[k] = 0.0;
                camFrame[k] = temp;
            }
            fftn(2, dims, reData1, imData1, 1, 0);
            // Cross-power spectrum: F1 times the complex conjugate of F2.
            for (k = 0; k < numPixels; k++) {
                reCorrelation[k] = reData1[k] * reData2[k] + imData1[k] * imData2[k];
                imCorrelation[k] = imData1[k] * reData2[k] - reData1[k] * imData2[k];
            }
        }
        else {
            even = 1;
            for (k = 0; k < numPixels; k++) {
                temp = camBuffer[numPixels - k - 1];
                reData2[k] = (double)(camFrame[k] - temp);
                imData2[k] = 0.0;
                camFrame[k] = temp;
            }
            fftn(2, dims, reData2, imData2, 1, 0);
            // Cross-power spectrum: F2 times the complex conjugate of F1.
            for (k = 0; k < numPixels; k++) {
                reCorrelation[k] = reData2[k] * reData1[k] + imData2[k] * imData1[k];
                imCorrelation[k] = imData2[k] * reData1[k] - reData2[k] * imData1[k];
            }
        }

        // The inverse transform yields the cross-correlation of the two difference frames.
        fftn(2, dims, reCorrelation, imCorrelation, -1, -1);

        // Locate the correlation peak.
        double max = -200000.0;
        int maxindex = 0;
        for (k = 0; k < numPixels; k++) {
            if (reCorrelation[k] > max) {
                max = reCorrelation[k];
                maxindex = k;
            }
        }

        glClear(GL_COLOR_BUFFER_BIT);
        glDrawPixels(FRAMESIZE, FRAMESIZE, GL_LUMINANCE,
                     GL_UNSIGNED_BYTE, camFrame);
        glFlush();
        auxSwapBuffers();



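        if (max > 5000.0) {
            // Convert the peak index to a (row, column) offset.
            double detecty = (double)((int)(maxindex / FRAMESIZE));
            double detectx = (double)(maxindex - detecty * FRAMESIZE);

            // Offsets past the half-frame point wrap around to the opposite direction.
            if (detectx > FRAMESIZE / 2) detectx = detectx - FRAMESIZE;
            if (detecty > FRAMESIZE / 2) detecty = FRAMESIZE - detecty;
            else detecty = -detecty;

            // Nonlinear gain: unity for small moves, exponential up to 10 pixels, zero beyond.
            double vector = sqrt(detectx * detectx + detecty * detecty);
            if (vector < 2) multiplier = 1;
            else if (vector < 10) multiplier = exp((vector - 2) / 2.5);
            else multiplier = 0;

            detectx *= multiplier;
            detecty *= multiplier;

            PositionCursor(detectx, detecty);
        }
    }
}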

static void CALLBACK Key_q(void)
{
    MV1StopGrab();
    MV1IRQDisable(MV1_MXfer_Int);
    MV1DisconnectInterruptCallback(0);
    MV1UnlockMasterBuffer(camBuffer, numPixels);

    fft_free();
    free(imData1);
    free(reData1);
    free(imData2);
    free(reData2);
    free(imCorrelation);
    free(reCorrelation);
    free(camBuffer);
    free(camFrame);

    auxQuit();
}

void MTPROCALL caminterrupt(short status)
{
    MV1InterruptProcessed(0);
    waitforinterrupt = 0;    // tell CursorControl that a new frame has arrived
}



void InitializeCamera(void)
{
    MV1GrabWindow grabWindow;
    MV1Frame Frame;
    unsigned long camrows, camcols;

    int mvstatus = MV1Open();
    mvstatus = MV1SetCurrentBoard(0);
    mvstatus = MV1LoadCameraConfigFile("f:\\mv-1000\\camcfg\\headmouse.ini", "CamConfig");
    MV1InquireMaxGrabWindowSize(&camcols, &camrows);
    numPixels = FRAMESIZE * FRAMESIZE;

    camBuffer = (unsigned char *)malloc(numPixels * sizeof(unsigned char));
    camFrame = (unsigned char *)malloc(numPixels * sizeof(char));
    // FFT Stuff
    reData1 = (double *)malloc(numPixels * sizeof(double));
    imData1 = (double *)malloc(numPixels * sizeof(double));
    reData2 = (double *)malloc(numPixels * sizeof(double));
    imData2 = (double *)malloc(numPixels * sizeof(double));
    reCorrelation = (double *)malloc(numPixels * sizeof(double));
    imCorrelation = (double *)malloc(numPixels * sizeof(double));
    // end FFT Stuff

    // Center a FRAMESIZE x FRAMESIZE grab window in the camera's field of view.
    int rowstart = camrows / 2 - FRAMESIZE / 2;
    int colstart = camcols / 2 - FRAMESIZE / 2;
    long windowaddress = 0x10000;

    MV1SetGrabWindow(windowaddress, colstart, rowstart, FRAMESIZE, FRAMESIZE,
                     &grabWindow);
    MV1CreateFrame(&Frame, 0, 0, FRAMESIZE, FRAMESIZE, &grabWindow);
    MV1ConnectInterruptToCallback(0, (FARPROC)&caminterrupt, MV1_MXfer_Int);
    MV1SelectTransferMode(MV1_Master_Mode | MV1_Master_Loop_Mode |
                          MV1_Master_FrameEnd_Mode);
    MV1LockMasterBuffer(camBuffer, numPixels);

    MV1SetMasterModeTransfer(&Frame, camBuffer, numPixels, 0);
    MV1SetMasterCtlId(0);
    MV1SelectTransferMode(MV1_Master_Mode | MV1_Master_Loop_Mode |
                          MV1_Master_FrameEnd_Mode);
    MV1IRQEnable(MV1_MXfer_Int);
    MV1StartGrab(MV1_Cont_Grab, MV1_Grab_Even);
    waitforinterrupt = 1;
}

void PositionCursor(double detectx, double detecty)
{
    static POINT ptCursor;

    // Move the cursor relative to its current position.
    GetCursorPos(&ptCursor);
    ptCursor.x += (long)detectx;
    ptCursor.y += (long)detecty;
    SetCursorPos(ptCursor.x, ptCursor.y);
}

/*--------------------------------*-C-*---------------------------------*
 * File:
 *    fftn.c
 *
 *    multivariate complex Fourier transform, computed in place
 *    using mixed-radix Fast Fourier Transform algorithm.
 *
 *    Fortran code by:
 *        RC Singleton, Stanford Research Institute, Sept. 1968
 *            NIST Guide to Available Math Software.
 *            Source for module FFT from package GO.
 *            Retrieved from NETLIB on Wed Jul 5 11:50:07 1995.
 *    translated by f2c (version 19950721) and with lots of cleanup
 *    to make it resemble C by:
 *        MJ Olesen, Queen's University at Kingston, 1995-97
 */

/*{{{ Copyright: */
/*
 * Copyright(c) 1995,97 Mark Olesen <olesen@me.QueensU.CA>
 *    Queen's Univ at Kingston (Canada)
 *
 * Permission to use, copy, modify, and distribute this software for
 * any purpose without fee is hereby granted, provided that this
 * entire notice is included in all copies of any software which is
 * or includes a copy or modification of this software and in all
 * copies of the supporting documentation for such software.
 *
 * THIS SOFTWARE IS BEING PROVIDED "AS IS", WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTY. IN PARTICULAR, NEITHER THE AUTHOR NOR QUEEN'S
 * UNIVERSITY AT KINGSTON MAKES ANY REPRESENTATION OR WARRANTY OF ANY
 * KIND CONCERNING THE MERCHANTABILITY OF THIS SOFTWARE OR ITS
 * FITNESS FOR ANY PARTICULAR PURPOSE.
 *
 * All of which is to say that you can do what you like with this
 * source code provided you don't try to sell it as your own and you
 * include an unaltered copy of this message (including the
 * copyright).
 *
 * It is also implicitly understood that bug fixes and improvements
 * should make their way back to the general Internet community so
 * that everyone benefits.
 */
/*}}}*/
/*{{{ notes: */
/*
 * Public:
 *    fft_free
 *    fftn / fftnf
 *    (these are documented in the header file)
 *
 * Private:
 *    fftradix / fftradixf
 *
 * int fftradix (REAL Re[], REAL Im[], size_t nTotal, size_t nPass,
 *               size_t nSpan, int iSign, size_t maxFactors,
 *               size_t maxPerm);
 *
 * RE and IM hold the real and imaginary components of the data, and
 * return the resulting real and imaginary Fourier coefficients.
 * Multidimensional data *must* be allocated contiguously. There is
 * no limit on the number of dimensions.
 *
 * Although there is no limit on the number of dimensions, fftradix()
 * must be called once for each dimension, but the calls may be in
 * any order.
 *
 * NTOTAL = the total number of complex data values
 * NPASS = the dimension of the current variable
 * NSPAN/NPASS = the spacing of consecutive data values while indexing
 *    the current variable
 * ISIGN - see above documentation
 *
 * example:
 * tri-variate transform with Re[n1][n2][n3], Im[n1][n2][n3]
 *
 *    fftradix (Re, Im, n1*n2*n3, n1, n1, 1, maxf, maxp);
 *    fftradix (Re, Im, n1*n2*n3, n2, n1*n2, 1, maxf, maxp);
 *    fftradix (Re, Im, n1*n2*n3, n3, n1*n2*n3, 1, maxf, maxp);
 *
 * single-variate transform,
 *    NTOTAL = N = NSPAN = (number of complex data values),
 *
 *    fftradix (Re, Im, n, n, n, 1, maxf, maxp);
 *
 * The data can also be stored in a single array with alternating
 * real and imaginary parts, the magnitude of ISIGN is changed to 2
 * to give correct indexing increment, and data [0] and data [1] used
 * to pass the initial addresses for the sequences of real and
 * imaginary values,
 *
 * example:
 *    REAL data [2*NTOTAL];
 *    fftradix (&data[0], &data[1], NTOTAL, nPass, nSpan, 2, maxf, maxp);
 *
 * for temporary allocation:
 *
 * MAXFACTORS >= the maximum prime factor of NPASS
 * MAXPERM >= the number of prime factors of NPASS. In addition,
 *    if the square-free portion K of NPASS has two or more prime
 *    factors, then MAXPERM >= (K-1)
 *
 * storage in FACTOR for a maximum of 15 prime factors of NPASS. if
 * NPASS has more than one square-free factor, the product of the
 * square-free factors must be <= 210 array storage for maximum prime
 * factor of 23 the following two constants should agree with the
 * array dimensions.
 */
/*}}}*/

/*{{{ Revisions: */
/*
 * 26 July 95 John Beale
 *    - added maxf and maxp as parameters to fftradix()
 *
 * 28 July 95 Mark Olesen <olesen@me.QueensU.CA>
 *    - cleaned-up the Fortran 66 goto spaghetti, only 3 labels remain.
 *
 *    - added fft_free() to provide some measure of control over
 *      allocation/deallocation.
 *
 *    - added fftn() wrapper for multidimensional FFTs
 *
 *    - use -DFFT_NOFLOAT or -DFFT_NODOUBLE to avoid compiling that
 *      precision. Note suffix 'f' on the function names indicates
 *      float precision.
 *
 *    - revised documentation
 *
 * 31 July 95 Mark Olesen <olesen@me.QueensU.CA>
 *    - added GNU Public License
 *    - more cleanup
 *    - define SUN_BROKEN_REALLOC to use malloc() instead of realloc()
 *      on the first pass through, apparently needed for old libc
 *    - removed #error directive in favour of some code that simply
 *      won't compile (generate an error that way)
 *
 * 1 Aug 95 Mark Olesen <olesen@me.QueensU.CA>
 *    - define FFT_RADIX4 to only have radix 2, radix 4 transforms
 *    - made fftradix /fftradixf () static scope, just use fftn()
 *      instead. If you have good ideas about fixing the factors
 *      in fftn() please do so.
 *
 * 8 Jan 95 mj olesen <olesen@me.QueensU.CA>
 *    - fixed typo's, including one that broke scaling for scaling by
 *      total number of matrix elements or the square root of same
 *    - removed unnecessary casts from allocations
 *
 * 10 Dec 96 mj olesen <olesen@me.QueensU.CA>
 *    - changes defines to compile *without* float support by default,
 *      use -DFFT_FLOAT to enable.
 *    - shifted some variables to local scope (better hints for optimizer)
 *    - added Michael Steffens <Michael.Steffens@mbox.muk.uni-hannover.de>
 *      Fortran 90 module
 *    - made it simpler to pass dimensions for 1D FFT.
 *
 * 23 Feb 97 Mark Olesen <olesen@me.QueensU.CA>
 *    - removed the GNU Public License (see 21 July 1995 entry),
 *      which should make it clear why I have the right to do so.
 *    - Added copyright notice and submitted to netlib
 *    - Moved documentation for the public functions to the header
 *      files where is will always be available.
 */
/*}}}*/

#ifndef _FFTN_C
#define _FFTN_C

/* we use CPP to re-include this same file for double/float cases */
#if !defined (lint) && !defined (__FILE__)
Error: your compiler is sick! define __FILE__ yourself (a string)
eg, something like -D__FILE__=\"fftn.c\"
#endif

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include "fftn.h"

/*{{{ defines/constants */
#ifndef M_PI
# define M_PI 3.14159265358979323846264338327950288
#endif

#ifndef SIN60
# define SIN60 0.86602540378443865 /* sin(60 deg) */
# define COS72 0.30901699437494742 /* cos(72 deg) */
# define SIN72 0.95105651629515357 /* sin(72 deg) */
#endif
/*}}}*/



/*{{{ static parameters - for memory management */
static size_t SpaceAlloced = 0;
static size_t MaxPermAlloced = 0;

/* temp space, (void *) since both float and double routines use it */
static void * Tmp0 = NULL; /* temp space for real part */
static void * Tmp1 = NULL; /* temp space for imaginary part */
static void * Tmp2 = NULL; /* temp space for Cosine values */
static void * Tmp3 = NULL; /* temp space for Sine values */
static int * Perm = NULL;  /* Permutation vector */

#define NFACTOR 11
static int factor [NFACTOR];
/*}}}*/

/*{{{ fft_free() */
void
fft_free (void)
{
    SpaceAlloced = MaxPermAlloced = 0;
    if (Tmp0) { free (Tmp0); Tmp0 = NULL; }
    if (Tmp1) { free (Tmp1); Tmp1 = NULL; }
    if (Tmp2) { free (Tmp2); Tmp2 = NULL; }
    if (Tmp3) { free (Tmp3); Tmp3 = NULL; }
    if (Perm) { free (Perm); Perm = NULL; }
}
/*}}}*/

/* return the number of factors */
static int
factorize (int nPass, int * kt)
{
    int nFactor = 0;
    int j, jj;

    *kt = 0;
    /* determine the factors of n */
    while ((nPass % 16) == 0) /* factors of 4 */
    {
        factor [nFactor++] = 4;
        nPass /= 16;
    }
    j = 3; jj = 9; /* factors of 3, 5, 7, ... */
    do {
        while ((nPass % jj) == 0)
        {
            factor [nFactor++] = j;
            nPass /= jj;
        }
        j += 2;
        jj = j * j;
    } while (jj <= nPass);
    if (nPass <= 4)
    {
        *kt = nFactor;
        factor [nFactor] = nPass;
        if (nPass != 1)
            nFactor++;
    }
    else
    {
        if (nPass - (nPass / 4 << 2) == 0)
        {
            factor [nFactor++] = 2;
            nPass /= 4;
        }
        *kt = nFactor;
        j = 2;
        do {
            if ((nPass % j) == 0)
            {
                factor [nFactor++] = j;
                nPass /= j;
            }
            j = ((j + 1) / 2 << 1) + 1;
        } while (j <= nPass);
    }
    if (*kt)
    {
        /* append the square factors again, in reverse order */
        j = *kt;
        do
            factor [nFactor++] = factor [--j];
        while (j);
    }

    return nFactor;
}

/* re-include this source file on the second pass through */
/*{{{ defines for re-including double precision */
#ifdef FFT_NODOUBLE
# ifndef FFT_FLOAT
#  define FFT_FLOAT
# endif
#else
# undef REAL
# undef FFTN
# undef FFTNS
# undef FFTRADIX
# undef FFTRADIXS
/* defines for double */
# define REAL double
# define FFTN fftn
# define FFTNS "fftn"
# define FFTRADIX fftradix
# define FFTRADIXS "fftradix"
/* double precision routine */
static int
fftradix (double Re[], double Im[],
          size_t nTotal, size_t nPass, size_t nSpan, int isign,
          int maxFactors, int maxPerm);
# include __FILE__ /* include this file again */
#endif
/*}}}*/ 



/*{{{ defines for re-including float precision */
#ifdef FFT_FLOAT
# undef REAL
# undef FFTN
# undef FFTNS
# undef FFTRADIX
# undef FFTRADIXS
/* defines for float */
# define REAL float
# define FFTN fftnf /* trailing 'f' for float */
# define FFTNS "fftnf" /* name for error message */
# define FFTRADIX fftradixf /* trailing 'f' for float */
# define FFTRADIXS "fftradixf" /* name for error message */
/* float precision routine */
static int
fftradixf (float Re[], float Im[],
           size_t nTotal, size_t nPass, size_t nSpan, int isign,
           int maxFactors, int maxPerm);
# include __FILE__ /* include this file again */
#endif

/*}}}*/ 

#else /* _FFTN_C */

/*
 * use macros to access the Real/Imaginary parts so that it's possible
 * to substitute different macros if a complex struct is used
 */
#ifndef Re_Data
# define Re_Data(i) Re[i]
# define Im_Data(i) Im[i]
#endif

int
FFTN (int ndim,
      const int dims [],
      REAL Re [],
      REAL Im [],
      int iSign,
      double scaling)
{

size_t nTotal; 

int maxFactors, maxPerm; 

    /*
     * tally the number of elements in the data array
     * and determine the number of dimensions
     */
    nTotal = 1;
    if (ndim)
    {
        if (dims != NULL)
        {
            int i;

            /* number of dimensions was specified */
            for (i = 0; i < ndim; i++)
            {
                if (dims [i] <= 0) goto Dimension_Error;
                nTotal *= dims [i];
            }
        }
        else
            nTotal *= ndim;
    }
    else
    {
        int i;

        /* determine # of dimensions from zero-terminated list */
        if (dims == NULL) goto Dimension_Error;
        for (ndim = i = 0; dims [i]; i++)
        {
            if (dims [i] <= 0)
                goto Dimension_Error;
            nTotal *= dims [i];
            ndim++;
        }
    }

    /* determine maximum number of factors and permutations */
#if 1
    /*
     * follow John Beale's example, just use the largest dimension and don't
     * worry about excess allocation. May be someone else will do it?
     */
    if (dims != NULL)
    {
        int i;

        for (maxFactors = maxPerm = 1, i = 0; i < ndim; i++)
        {
            if (dims [i] > maxFactors) maxFactors = dims [i];
            if (dims [i] > maxPerm) maxPerm = dims [i];
        }
    }
    else
    {
        maxFactors = maxPerm = nTotal;
    }
#else
    /* use the constants used in the original Fortran code */
    maxFactors = 23;
    maxPerm = 209;
#endif

    /* loop over the dimensions: */
    if (dims != NULL)
    {
        size_t nSpan = 1;
        int i;

        for (i = 0; i < ndim; i++)
        {
            int ret;

            nSpan *= dims [i];
            ret = FFTRADIX (Re, Im, nTotal, dims [i], nSpan, iSign,
                            maxFactors, maxPerm);
            /* exit, clean-up already done */
            if (ret)
                return ret;
        }
    }
    else
    {
        int ret;

        ret = FFTRADIX (Re, Im, nTotal, nTotal, nTotal, iSign,
                        maxFactors, maxPerm);
        /* exit, clean-up already done */
        if (ret)
            return ret;
    }

    /* Divide through by the normalizing constant: */
    if (scaling && scaling != 1.0)
    {
        int i;

        if (iSign < 0) iSign = -iSign;
        if (scaling < 0.0)
            scaling = (scaling < -1.0) ? sqrt (nTotal) : nTotal;
        scaling = 1.0 / scaling; /* multiply is often faster */
        for (i = 0; i < nTotal; i += iSign)
        {
            Re_Data (i) *= scaling;
            Im_Data (i) *= scaling;
        }
    }

    return 0;

Dimension_Error:
    fprintf (stderr, "Error: " FFTNS "() - dimension error\n");
    fft_free (); /* free-up memory */
    return -1;
}

/* */
/*
 * singleton's mixed radix routine
 *
 * could move allocation out to fftn(), but leave it here so that it's
 * possible to make this a standalone function
 */
static int
FFTRADIX (REAL Re [],
          REAL Im [],
          size_t nTotal,
          size_t nPass,
          size_t nSpan,
          int iSign,
          int maxFactors,
          int maxPerm)
{
    int ii, nFactor, kspan, ispan, inc;
    int j, jc, jf, jj, k, k1, k3, kk, kt, nn, ns, nt;

    REAL radf;
    REAL c1, c2, c3, cd;
    REAL s1, s2, s3, sd;

    REAL * Rtmp = NULL; /* temp space for real part */
    REAL * Itmp = NULL; /* temp space for imaginary part */
    REAL * Cos = NULL; /* Cosine values */
    REAL * Sin = NULL; /* Sine values */

#ifndef FFT_RADIX4
    REAL s60 = SIN60; /* sin(60 deg) */
    REAL s72 = SIN72; /* sin(72 deg) */
    REAL c72 = COS72; /* cos(72 deg) */
#endif
    REAL pi2 = M_PI; /* use PI first, 2 PI later */



/* gcc complains about k3 being uninitialized, but I can't find out where 

* or why ... it looks okay to me. 

* initialize to make gcc happy 
*/ 

    k3 = 0;

    /* gcc complains about c2, c3, s2, s3 being uninitialized, but they're
     * only used for the radix 4 case and only AFTER the (s1 == 0.0) pass
     * through the loop at which point they will have been calculated.
     *
     * initialize to make gcc happy
     */
    c2 = c3 = s2 = s3 = 0.0;

    /* Parameter adjustments, was fortran so fix zero-offset */
    Re--;
    Im--;



if (nPass < 2) 
return 0; 



/* allocate storage */ 

    if (SpaceAlloced < maxFactors * sizeof (REAL))
    {
#ifdef SUN_BROKEN_REALLOC
        if (!SpaceAlloced) /* first time */
        {
            SpaceAlloced = maxFactors * sizeof (REAL);
            Tmp0 = malloc (SpaceAlloced);
            Tmp1 = malloc (SpaceAlloced);
            Tmp2 = malloc (SpaceAlloced);
            Tmp3 = malloc (SpaceAlloced);
        }
        else
        {
#endif
            SpaceAlloced = maxFactors * sizeof (REAL);
            Tmp0 = realloc (Tmp0, SpaceAlloced);
            Tmp1 = realloc (Tmp1, SpaceAlloced);
            Tmp2 = realloc (Tmp2, SpaceAlloced);
            Tmp3 = realloc (Tmp3, SpaceAlloced);
#ifdef SUN_BROKEN_REALLOC
        }
#endif
    }
    else
    {
        /* allow full use of alloc'd space */
        maxFactors = SpaceAlloced / sizeof (REAL);
    }

    if (MaxPermAlloced < maxPerm)
    {
#ifdef SUN_BROKEN_REALLOC
        if (!MaxPermAlloced) /* first time */
            Perm = malloc (maxPerm * sizeof(int));
        else
#endif
            Perm = realloc (Perm, maxPerm * sizeof(int));
        MaxPermAlloced = maxPerm;
    }
    else
    {
        /* allow full use of alloc'd space */
        maxPerm = MaxPermAlloced;
    }

    if (!Tmp0 || !Tmp1 || !Tmp2 || !Tmp3 || !Perm) goto Memory_Error;

    /* assign pointers */
    Rtmp = (REAL *) Tmp0;
    Itmp = (REAL *) Tmp1;
    Cos = (REAL *) Tmp2;
    Sin = (REAL *) Tmp3;

/* 

* Function Body 
*/ 

inc = iSign; 
    if (iSign < 0)
    {
#ifndef FFT_RADIX4
        s60 = -s60;
        s72 = -s72;
#endif
        pi2 = -pi2;
        inc = -inc; /* absolute value */
    }

/* adjust for strange increments */ 
nt = inc * nTotal; 
ns = inc * nSpan; 
kspan = ns; 

nn = nt - inc; 
jc = ns / nPass; 
    radf = pi2 * (double) jc;
    pi2 *= 2.0; /* use 2 PI from here on */

NON LIMITING SOURCE CODE FOR U.S. PROVISIONAL APPLICATION ENTITLED
"HUMAN MOTION FOLLOWING COMPUTER MOUSE AND GAME CONTROLLER" by
D. Frey et al., Filed September 11, 1998

    ii = 0;
    jf = 0;

    /* determine the factors of n */
    nFactor = factorize (nPass, &kt);
    /* test that nFactors is in range */
    if (nFactor > NFACTOR)
    {
        fprintf (stderr, "Error: " FFTRADIXS "() - exceeded number of factors\n");
        goto Memory_Error;
    }

    /* compute fourier transform */
    for (;;) {
        sd = radf / (double) kspan;
        cd = sin (sd);
        cd = 2.0 * cd * cd;
        sd = sin (sd + sd);
        kk = 1;
        ii++;

        switch (factor [ii - 1]) {
        case 2:
            /* transform for factor of 2 (including rotation factor) */
            kspan /= 2;
            k1 = kspan + 2;
            do {
                do {
                    REAL tmpr;
                    REAL tmpi;
                    int k2;

                    k2 = kk + kspan;
                    tmpr = Re_Data (k2);
                    tmpi = Im_Data (k2);
                    Re_Data (k2) = Re_Data (kk) - tmpr;
                    Im_Data (k2) = Im_Data (kk) - tmpi;
                    Re_Data (kk) += tmpr;
                    Im_Data (kk) += tmpi;
                    kk = k2 + kspan;
                } while (kk <= nn);
                kk -= nn;
            } while (kk <= jc);
            if (kk > kspan)
                goto Permute_Results; /* exit infinite loop */
            do {
                int k2;

                c1 = 1.0 - cd;
                s1 = sd;
                do {
                    REAL tmp;

                    do {
                        do {
                            REAL tmpr;
                            REAL tmpi;

                            k2 = kk + kspan;
                            tmpr = Re_Data (kk) - Re_Data (k2);
                            tmpi = Im_Data (kk) - Im_Data (k2);
                            Re_Data (kk) += Re_Data (k2);
                            Im_Data (kk) += Im_Data (k2);
                            Re_Data (k2) = c1 * tmpr - s1 * tmpi;
                            Im_Data (k2) = s1 * tmpr + c1 * tmpi;
                            kk = k2 + kspan;
                        } while (kk < nt);
                        k2 = kk - nt;
                        c1 = -c1;
                        kk = k1 - k2;
                    } while (kk > k2);
                    tmp = c1 - (cd * c1 + sd * s1);
                    s1 = sd * c1 - cd * s1 + s1;
                    c1 = 2.0 - (tmp * tmp + s1 * s1);
                    s1 *= c1;
                    c1 *= tmp;
                    kk += jc;
                } while (kk < k2);
                k1 += (inc + inc);
                kk = (k1 - kspan) / 2 + jc;
            } while (kk <= jc + jc);
            break;

        case 4: /* transform for factor of 4 */
            ispan = kspan;
            kspan /= 4;

            do {
                c1 = 1.0;
                s1 = 0.0;
                do {
                    do {
                        REAL ajm, ajp, akm, akp;
                        REAL bjm, bjp, bkm, bkp;
                        int k2;

                        k1 = kk + kspan;
                        k2 = k1 + kspan;
                        k3 = k2 + kspan;

                        akp = Re_Data (kk) + Re_Data (k2);
                        akm = Re_Data (kk) - Re_Data (k2);

                        ajp = Re_Data (k1) + Re_Data (k3);
                        ajm = Re_Data (k1) - Re_Data (k3);

                        bkp = Im_Data (kk) + Im_Data (k2);
                        bkm = Im_Data (kk) - Im_Data (k2);

                        bjp = Im_Data (k1) + Im_Data (k3);
                        bjm = Im_Data (k1) - Im_Data (k3);

                        Re_Data (kk) = akp + ajp;
                        Im_Data (kk) = bkp + bjp;
                        ajp = akp - ajp;
                        bjp = bkp - bjp;
                        if (iSign < 0)
                        {
                            akp = akm + bjm;
                            bkp = bkm - ajm;
                            akm -= bjm;
                            bkm += ajm;
                        }
                        else
                        {
                            akp = akm - bjm;
                            bkp = bkm + ajm;
                            akm += bjm;
                            bkm -= ajm;
                        }
                        /* avoid useless multiplies */
                        if (s1 == 0.0)
                        {
                            Re_Data (k1) = akp;
                            Re_Data (k2) = ajp;
                            Re_Data (k3) = akm;
                            Im_Data (k1) = bkp;
                            Im_Data (k2) = bjp;
                            Im_Data (k3) = bkm;
                        }
                        else
                        {
                            Re_Data (k1) = akp * c1 - bkp * s1;
                            Re_Data (k2) = ajp * c2 - bjp * s2;
                            Re_Data (k3) = akm * c3 - bkm * s3;
                            Im_Data (k1) = akp * s1 + bkp * c1;
                            Im_Data (k2) = ajp * s2 + bjp * c2;
                            Im_Data (k3) = akm * s3 + bkm * c3;
                        }
                        kk = k3 + kspan;
                    } while (kk <= nt);

                    c2 = c1 - (cd * c1 + sd * s1);
                    s1 = sd * c1 - cd * s1 + s1;
                    c1 = 2.0 - (c2 * c2 + s1 * s1);
                    s1 *= c1;
                    c1 *= c2;

                    /* values of c2, c3, s2, s3 that will get used next time */
                    c2 = c1 * c1 - s1 * s1;
                    s2 = 2.0 * c1 * s1;
                    c3 = c2 * c1 - s2 * s1;
                    s3 = c2 * s1 + s2 * c1;
                    kk = kk - nt + jc;
                } while (kk <= kspan);
                kk = kk - kspan + inc;
            } while (kk <= jc);
            if (kspan == jc)
                goto Permute_Results; /* exit infinite loop */
            break;

        default:
            /* transform for odd factors */
#ifdef FFT_RADIX4
            fprintf (stderr, "Error: " FFTRADIXS "(): compiled for radix 2/4 only\n");
            fft_free (); /* free-up memory */
            return -1;
            break;
#else /* FFT_RADIX4 */
            ispan = kspan;
            k = factor [ii - 1];
            kspan /= factor [ii - 1];

            switch (factor [ii - 1]) {
            case 3: /* transform for factor of 3 (optional code) */
                do {
                    do {
                        REAL aj, tmpr;
                        REAL bj, tmpi;
                        int k2;

                        k1 = kk + kspan;
                        k2 = k1 + kspan;
                        tmpr = Re_Data (kk);
                        tmpi = Im_Data (kk);
                        aj = Re_Data (k1) + Re_Data (k2);
                        bj = Im_Data (k1) + Im_Data (k2);
                        Re_Data (kk) = tmpr + aj;
                        Im_Data (kk) = tmpi + bj;
                        tmpr -= 0.5 * aj;
                        tmpi -= 0.5 * bj;
                        aj = (Re_Data (k1) - Re_Data (k2)) * s60;
                        bj = (Im_Data (k1) - Im_Data (k2)) * s60;
                        Re_Data (k1) = tmpr - bj;
                        Re_Data (k2) = tmpr + bj;
                        Im_Data (k1) = tmpi + aj;
                        Im_Data (k2) = tmpi - aj;
                        kk = k2 + kspan;
                    } while (kk < nn);
                    kk -= nn;
                } while (kk <= kspan);
                break;

            case 5: /* transform for factor of 5 (optional code) */
                c2 = c72 * c72 - s72 * s72;
                s2 = 2.0 * c72 * s72;
                do {
                    do {
                        REAL aa, aj, ak, ajm, ajp, akm, akp;
                        REAL bb, bj, bk, bjm, bjp, bkm, bkp;
                        int k2, k4;

                        k1 = kk + kspan;
                        k2 = k1 + kspan;
                        k3 = k2 + kspan;
                        k4 = k3 + kspan;

                        akp = Re_Data (k1) + Re_Data (k4);
                        akm = Re_Data (k1) - Re_Data (k4);
                        bkp = Im_Data (k1) + Im_Data (k4);
                        bkm = Im_Data (k1) - Im_Data (k4);
                        ajp = Re_Data (k2) + Re_Data (k3);
                        ajm = Re_Data (k2) - Re_Data (k3);
                        bjp = Im_Data (k2) + Im_Data (k3);
                        bjm = Im_Data (k2) - Im_Data (k3);
                        aa = Re_Data (kk);
                        bb = Im_Data (kk);
                        Re_Data (kk) = aa + akp + ajp;
                        Im_Data (kk) = bb + bkp + bjp;
                        ak = akp * c72 + ajp * c2 + aa;
                        bk = bkp * c72 + bjp * c2 + bb;
                        aj = akm * s72 + ajm * s2;
                        bj = bkm * s72 + bjm * s2;
                        Re_Data (k1) = ak - bj;
                        Re_Data (k4) = ak + bj;
                        Im_Data (k1) = bk + aj;
                        Im_Data (k4) = bk - aj;
                        ak = akp * c2 + ajp * c72 + aa;
                        bk = bkp * c2 + bjp * c72 + bb;
                        aj = akm * s2 - ajm * s72;
                        bj = bkm * s2 - bjm * s72;
                        Re_Data (k2) = ak - bj;
                        Re_Data (k3) = ak + bj;
                        Im_Data (k2) = bk + aj;
                        Im_Data (k3) = bk - aj;
                        kk = k4 + kspan;
                    } while (kk < nn);
                    kk -= nn;
                } while (kk <= kspan);
                break;

            default:
                k = factor [ii - 1];
                if (jf != k)
                {
                    jf = k;
                    s1 = pi2 / (double) jf;
                    c1 = cos (s1);
                    s1 = sin (s1);
                    if (jf > maxFactors)
                        goto Memory_Error;
                    Cos [jf - 1] = 1.0;
                    Sin [jf - 1] = 0.0;
                    j = 1;
                    do {
                        Cos [j - 1] = Cos [k - 1] * c1 + Sin [k - 1] * s1;
                        Sin [j - 1] = Cos [k - 1] * s1 - Sin [k - 1] * c1;
                        k--;
                        Cos [k - 1] = Cos [j - 1];
                        Sin [k - 1] = -Sin [j - 1];
                        j++;
                    } while (j < k);
                }

                do {
                    do {
                        REAL aa, ak;
                        REAL bb, bk;
                        int k2;

                        aa = ak = Re_Data (kk);
                        bb = bk = Im_Data (kk);

                        k1 = kk;
                        k2 = kk + ispan;
                        j = 1;
                        k1 += kspan;
                        do {
                            k2 -= kspan;
                            Rtmp [j] = Re_Data (k1) + Re_Data (k2);
                            ak += Rtmp [j];
                            Itmp [j] = Im_Data (k1) + Im_Data (k2);
                            bk += Itmp [j];
                            j++;
                            Rtmp [j] = Re_Data (k1) - Re_Data (k2);
                            Itmp [j] = Im_Data (k1) - Im_Data (k2);
                            j++;
                            k1 += kspan;
                        } while (k1 < k2);
                        Re_Data (kk) = ak;
                        Im_Data (kk) = bk;

                        k1 = kk;
                        k2 = kk + ispan;
                        j = 1;
                        do {
                            REAL aj = 0.0;
                            REAL bj = 0.0;

                            k1 += kspan;
                            k2 -= kspan;
                            jj = j;
                            ak = aa;
                            bk = bb;
                            k = 1;
                            do {
                                ak += Rtmp [k] * Cos [jj - 1];
                                bk += Itmp [k] * Cos [jj - 1];
                                k++;
                                aj += Rtmp [k] * Sin [jj - 1];
                                bj += Itmp [k] * Sin [jj - 1];
                                k++;
                                jj += j;
                                if (jj > jf)
                                    jj -= jf;
                            } while (k < jf);
                            k = jf - j;
                            Re_Data (k1) = ak - bj;
                            Im_Data (k1) = bk + aj;
                            Re_Data (k2) = ak + bj;
                            Im_Data (k2) = bk - aj;
                            j++;
                        } while (j < k);
                        kk += ispan;
                    } while (kk <= nn);
                    kk -= nn;
                } while (kk <= kspan);
                break;
            }

            /* multiply by rotation factor (except for factors of 2 and 4) */
            if (ii == nFactor)
                goto Permute_Results; /* exit infinite loop */
            kk = jc + 1;
            do {
                c2 = 1.0 - cd;
                s1 = sd;
                do {
                    c1 = c2;
                    s2 = s1;
                    kk += kspan;
                    do {
                        REAL tmp;

                        do {
                            REAL ak;

                            ak = Re_Data (kk);
                            Re_Data (kk) = c2 * ak - s2 * Im_Data (kk);
                            Im_Data (kk) = s2 * ak + c2 * Im_Data (kk);
                            kk += ispan;
                        } while (kk <= nt);
                        tmp = s1 * s2;
                        s2 = s1 * c2 + c1 * s2;
                        c2 = c1 * c2 - tmp;
                        kk = kk - nt + kspan;
                    } while (kk <= ispan);
                    c2 = c1 - (cd * c1 + sd * s1);
                    s1 += sd * c1 - cd * s1;
                    c1 = 2.0 - (c2 * c2 + s1 * s1);
                    s1 *= c1;
                    c2 *= c1;
                    kk = kk - ispan + jc;
                } while (kk <= kspan);
                kk = kk - kspan + jc + inc;
            } while (kk <= jc + jc);
            break;
#endif /* FFT_RADIX4 */
        }
    }



    /* permute the results to normal order -- done in two stages */
    /* permutation for square factors of n */
Permute_Results:
    Perm [0] = ns;
    if (kt)
    {
        int k2;

        k = kt + kt + 1;
        if (k > nFactor)
            k--;
        Perm [k] = jc;
        j = 1;
        do {
            Perm [j] = Perm [j - 1] / factor [j - 1];
            Perm [k - 1] = Perm [k] * factor [j - 1];
            j++;
            k--;
        } while (j < k);
        k3 = Perm [k];
        kspan = Perm [1];
        kk = jc + 1;
        k2 = kspan + 1;
        j = 1;
        if (nPass != nTotal)
        {
            /* permutation for multivariate transform */
Permute_Multi:
            do {
                do {
                    k = kk + jc;
                    do {
                        /* swap
                         * Re_Data (kk) <> Re_Data (k2)
                         * Im_Data (kk) <> Im_Data (k2)
                         */
                        REAL tmp;

                        tmp = Re_Data (kk); Re_Data (kk) = Re_Data (k2); Re_Data (k2) = tmp;
                        tmp = Im_Data (kk); Im_Data (kk) = Im_Data (k2); Im_Data (k2) = tmp;
                        kk += inc;
                        k2 += inc;
                    } while (kk < k);
                    kk += (ns - jc);
                    k2 += (ns - jc);
                } while (kk < nt);
                k2 = k2 - nt + kspan;
                kk = kk - nt + jc;
            } while (k2 < ns);
            do {
                do {
                    k2 -= Perm [j - 1];
                    j++;
                    k2 = Perm [j] + k2;
                } while (k2 > Perm [j - 1]);
                j = 1;
                do {
                    if (kk < k2)
                        goto Permute_Multi;
                    kk += jc;
                    k2 += kspan;
                } while (k2 < ns);
            } while (kk < ns);
        }
        else
        {
            /* permutation for single-variate transform (optional code) */
Permute_Single:
            do {
                /* swap
                 * Re_Data (kk) <> Re_Data (k2)
                 * Im_Data (kk) <> Im_Data (k2)
                 */
                REAL t;

                t = Re_Data (kk); Re_Data (kk) = Re_Data (k2); Re_Data (k2) = t;
                t = Im_Data (kk); Im_Data (kk) = Im_Data (k2); Im_Data (k2) = t;
                kk += inc;
                k2 += kspan;
            } while (k2 < ns);
            do {
                do {
                    k2 -= Perm [j - 1];
                    j++;
                    k2 = Perm [j] + k2;
                } while (k2 > Perm [j - 1]);
                j = 1;
                do {
                    if (kk < k2)
                        goto Permute_Single;
                    kk += inc;
                    k2 += kspan;
                } while (k2 < ns);
            } while (kk < ns);
        }
        jc = k3;
    }

    if ((kt << 1) + 1 >= nFactor)
        return 0;
    ispan = Perm [kt];

    /* permutation for square-free factors of n */
    j = nFactor - kt;
    factor [j] = 1;
    do {
        factor [j - 1] *= factor [j];
        j--;
    } while (j != kt);
    nn = factor [kt] - 1;
    kt++;

    if (nn > maxPerm)
        goto Memory_Error;

    j = jj = 0;
    for (;;) {
        int k2;

        k = kt + 1;
        k2 = factor [kt - 1];
        kk = factor [k - 1];
        j++;
        if (j > nn)
            break; /* exit infinite loop */
        jj += kk;
        while (jj >= k2)
        {
            jj -= k2;
            k2 = kk;
            kk = factor [k++];
            jj += kk;
        }
        Perm [j - 1] = jj;
    }

    /* determine the permutation cycles of length greater than 1 */
    j = 0;
    for (;;) {
        do {
            kk = Perm [j++];
        } while (kk < 0);
        if (kk != j)
        {
            do {
                k = kk;
                kk = Perm [k - 1];
                Perm [k - 1] = -kk;
            } while (kk != j);
            k3 = kk;
        }
        else
        {
            Perm [j - 1] = -j;
            if (j == nn)
                break; /* exit infinite loop */
        }
    }

    maxFactors *= inc;

    /* reorder a and b, following the permutation cycles */
    for (;;) {
        j = k3 + 1;
        nt -= ispan;
        ii = nt - inc + 1;
        if (nt < 0)
            break; /* exit infinite loop */
        do {
            do {
                j--;
            } while (Perm [j - 1] < 0);
            jj = jc;
            do {
                int k2;

                if (jj < maxFactors) kspan = jj; else kspan = maxFactors;
                jj -= kspan;
                k = Perm [j - 1];
                kk = jc * k + ii + jj;

                k1 = kk + kspan;
                k2 = 0;
                do {
                    Rtmp [k2] = Re_Data (k1);
                    Itmp [k2] = Im_Data (k1);
                    k2++;
                    k1 -= inc;
                } while (k1 != kk);

                do {
                    k1 = kk + kspan;
                    k2 = k1 - jc * (k + Perm [k - 1]);
                    k = -Perm [k - 1];
                    do {
                        Re_Data (k1) = Re_Data (k2);
                        Im_Data (k1) = Im_Data (k2);
                        k1 -= inc;
                        k2 -= inc;
                    } while (k1 != kk);
                    kk = k2;
                } while (k != j);

                k1 = kk + kspan;
                k2 = 0;
                do {
                    Re_Data (k1) = Rtmp [k2];
                    Im_Data (k1) = Itmp [k2];
                    k2++;
                    k1 -= inc;
                } while (k1 != kk);
            } while (jj);
        } while (j != 1);
    }

    return 0; /* exit point here */

    /* alloc or other problem, do some clean-up */
Memory_Error:
    fprintf (stderr, "Error: " FFTRADIXS "() - insufficient memory.\n");
    fft_free (); /* free-up memory */
    return -1;
}

#endif /* _FFTN_C */

/* end-of-file (C source) */ 
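The factor ordering that factorize() produces in the listing above can be tried in isolation. What follows is a minimal standalone sketch, not part of the original source: the names split_factors and MAX_FACTORS are hypothetical, and the file-scope factor[] table of the listing is replaced by a caller-supplied array. It appears to follow the same three steps as the listing: square factors first, then the square-free remainder, then the square factors again in mirrored order.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FACTORS 32 /* hypothetical bound, ample for any positive int */

/* Split n into radix factors in the order the listing uses.
 * Returns the number of factors written to out[]; *n_square receives
 * the number of square-factor entries found before mirroring
 * (the listing calls this value kt). */
static int split_factors(int n, int out[MAX_FACTORS], int *n_square)
{
    int count = 0;
    int j, jj;

    assert(n >= 1);
    *n_square = 0;

    /* each factor of 16 contributes one 4 here and one mirrored 4 later */
    while (n % 16 == 0) { out[count++] = 4; n /= 16; }

    /* odd square factors: each p*p contributes one p here, one later */
    j = 3; jj = 9;
    do {
        while (n % jj == 0) { out[count++] = j; n /= jj; }
        j += 2;
        jj = j * j;
    } while (jj <= n);

    if (n <= 4) {
        *n_square = count;
        out[count] = n;
        if (n != 1) count++;
    } else {
        if (n % 4 == 0) { out[count++] = 2; n /= 4; }
        *n_square = count;
        /* square-free remainder: try 2, then 3, 5, 7, ... */
        j = 2;
        do {
            if (n % j == 0) { out[count++] = j; n /= j; }
            j = ((j + 1) / 2 << 1) + 1;
        } while (j <= n);
    }

    /* mirror the square factors behind the square-free ones */
    if (*n_square) {
        j = *n_square;
        do { out[count++] = out[--j]; } while (j);
    }
    return count;
}
```

For n = 360 this sketch yields the six factors 3, 2, 2, 5, 2, 3 with two square-factor entries, and for n = 16 it yields 4, 4 with one.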
/*--------------------------------*-C-*---------------------------------*
 * File:
 *      fftn.h
 *
 * Singleton's multivariate complex Fourier transform, computed in
 * place using mixed-radix Fast Fourier Transform algorithm.
 *
 * Called here `fftn' since it does a radix-n FFT on n-dimensional data
 *
 * Copyright(c) 1995,97 Mark Olesen <olesen@me.QueensU.CA>
 *              Queen's Univ at Kingston (Canada)
 *
 * Permission to use, copy, modify, and distribute this software for
 * any purpose without fee is hereby granted, provided that this
 * entire notice is included in all copies of any software which is
 * or includes a copy or modification of this software and in all
 * copies of the supporting documentation for such software.
 *
 * THIS SOFTWARE IS BEING PROVIDED "AS IS", WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTY. IN PARTICULAR, NEITHER THE AUTHOR NOR QUEEN'S
 * UNIVERSITY AT KINGSTON MAKES ANY REPRESENTATION OR WARRANTY OF ANY
 * KIND CONCERNING THE MERCHANTABILITY OF THIS SOFTWARE OR ITS
 * FITNESS FOR ANY PARTICULAR PURPOSE.
 *
 * All of which is to say that you can do what you like with this
 * source code provided you don't try to sell it as your own and you
 * include an unaltered copy of this message (including the
 * copyright).
 *
 * It is also implicitly understood that bug fixes and improvements
 * should make their way back to the general Internet community so
 * that everyone benefits.
 *
 * Brief overview of parameters:
 *
 * Re[]:    real value array
 * Im[]:    imaginary value array
 * nTotal:  total number of complex values
 * nPass:   number of elements involved in this pass of transform
 * nSpan:   nspan/nPass = number of bytes to increment pointer
 *          in Re[] and Im[]
 * isign:   exponent: +1 = forward  -1 = reverse
 * scaling: normalizing constant by which the final result is DIVIDED
 *          scaling = -1, normalize by total dimension of the transform
 *          scaling < -1, normalize by the square-root of the total dimension
 *
 * Slightly more detailed information:
 *----------------------------------------------------------------------*
 * void fft_free (void);
 *
 * free-up allocated temporary storage after finished all the Fourier
 * transforms.
 *
 *----------------------------------------------------------------------*
 * int fftn (int ndim, const int dims[], REAL Re[], REAL Im[],
 *           int iSign, double scaling);
 *
 * NDIM = the total number dimensions
 * DIMS = a vector of array sizes
 *        if NDIM is zero then DIMS must be zero-terminated
 *
 * RE and IM hold the real and imaginary components of the data, and
 * return the resulting real and imaginary Fourier coefficients.
 * Multidimensional data *must* be allocated contiguously. There is
 * no limit on the number of dimensions.
 *
 * ISIGN = the sign of the complex exponential
 *         (ie, forward or inverse FFT)
 *         the magnitude of ISIGN (normally 1) is used to determine
 *         the correct indexing increment (see below).
 *
 * SCALING = normalizing constant by which the final result is DIVIDED
 *           if SCALING = -1, normalize by total dimension of the transform
 *           if SCALING < -1, normalize by the square-root of the total dimension
 *
 * example:
 * tri-variate transform with Re[n3][n2][n1], Im[n3][n2][n1]
 *
 *      int dims[3] = {n1,n2,n3}
 *      fftn (3, dims, Re, Im, 1, scaling);
 *
 * or, using a null terminated dimension list
 *      int dims[4] = {n1,n2,n3,0}
 *      fftn (0, dims, Re, Im, 1, scaling);
 */


#ifndef _FFTN_H
#define _FFTN_H

#ifdef __cplusplus
extern "C" {
#endif

extern void fft_free (void);

/* double precision routine */
extern int fftn (int /* ndim */,
                 const int /* dims */[],
                 double /* Re */[],
                 double /* Im */[],
                 int /* isign */,
                 double /* scaling */);

/* float precision routine */
extern int fftnf (int /* ndim */,
                  const int /* dims */[],
                  float /* Re */[],
                  float /* Im */[],
                  int /* isign */,
                  double /* scaling */);

#ifdef __cplusplus
}
#endif

#endif /* _FFTN_H */

/* end-of-file (C header) */
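The header comment requires multidimensional data to be contiguous and lists the sizes as dims = {n1,n2,n3} for arrays declared Re[n3][n2][n1]. The sketch below illustrates the element tally and the flat offset implied by that layout. It is an illustration under stated assumptions, not part of fftn.h: tally_elements and flat_index are hypothetical helper names, and the tally only mirrors the dimension rules described in the comment (a count of zero means the list is zero-terminated).

```c
#include <assert.h>
#include <stddef.h>

/* Number of complex elements described by a dimension list.
 * ndim == 0 means dims is zero-terminated; a NULL dims with ndim > 0
 * is read as a single dimension of length ndim.
 * Returns 0 for a dimension error. */
static size_t tally_elements(int ndim, const int dims[])
{
    size_t total = 1;
    int i;

    if (dims == NULL)
        return ndim > 0 ? (size_t) ndim : 0;
    if (ndim) {
        for (i = 0; i < ndim; i++) {
            if (dims[i] <= 0) return 0;
            total *= (size_t) dims[i];
        }
    } else {
        for (i = 0; dims[i] != 0; i++) {
            if (dims[i] < 0) return 0;
            total *= (size_t) dims[i];
        }
    }
    return total;
}

/* Flat offset of element [i3][i2][i1] in a contiguous Re[n3][n2][n1]
 * block, where dims = {n1, n2, n3} (fastest-varying size first). */
static size_t flat_index(const int dims[3], int i1, int i2, int i3)
{
    assert(i1 >= 0 && i1 < dims[0]);
    assert(i2 >= 0 && i2 < dims[1]);
    assert(i3 >= 0 && i3 < dims[2]);
    return ((size_t) i3 * (size_t) dims[1] + (size_t) i2) * (size_t) dims[0]
           + (size_t) i1;
}
```

With dims = {4, 3, 2}, both the counted and the zero-terminated form describe 24 elements, and element [1][2][1] sits at flat offset 21. A caller would allocate Re and Im as two blocks of that many values before passing them to the transform.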